
Testing - Examinations - Assessments

Everyone needs to fully understand just how ineffective some tests are, and that they can be dangerously flawed. I don't want some child's life ruined because they did not pass a particular test. And I certainly don't want a child to be misled just because they did pass a test. Testing should only be a guide. Testing should not be a form of judgment or a measure of worth, and it should not be used to manipulate or mislead someone, in the same way that praise sometimes does.


Will this be on the Test?

I passed a lot of tests in the schools that I attended, and I still walked away knowing very little about myself and the world around me. If the questions on the test don't matter, then getting a right or wrong answer does not matter. So just getting the answers right does not say that the knowledge is important, nor does it guarantee that you will understand the knowledge and information well enough to use it effectively and efficiently. Every school on the planet needs to teach this as a fact. If they don't, then every student walks away with a false sense of security, which could have some devastating consequences that we can clearly see all around the world today in the form of wars, thousands of crimes being committed, millions of people with diseases, hundreds of social problems, and nonstop corruption, just to name a few.

Students performed significantly better on tests they chose to take than on tests they were forced to take.

Life is your test, if you don't learn how to pass this test, then mistakes and failures will follow. So who failed? Me or the teacher?

Now the test has begun...

People should never feel threatened by tests, as if their intelligence is somehow being judged. Testing should only be a tool for learning. Testing should never be used as a weapon for judging, controlling, or manipulating people, or for discriminating against them. Testing should only be a tool to encourage development, and not be used to discourage development. Testing should only be a tool to measure progress, and not be used to impede progress.

Life is Not a Test, Life is a form of learning, adapting and Surviving.

Surviving is to live through hardship or adversity. Support oneself. Stay in Existence. Live

"Instead of failing because you got half the answers wrong, you should instead be saying, I got half the answers correct, that means I'm halfway there."

"Measurable Impacts on Learning"

If people only learn things just to pass a test, then they have already failed the test. You should learn things because they are valuable, knowledge that you will use in your life. If learning is only about passing a test, then most people will stop learning once there is no external benefit, such as a test grade, to be gained. Being good just for the reward

Do you study to pass a test, or do you study to learn? Testing should have a reason. It should accurately measure a person's understanding of a subject, so that they know whether they understand the knowledge well enough to apply it to a particular real life action. When you take a driving test you are proving that you know how to drive safely and effectively. When you take a math test, you are proving that you know how to use math effectively to solve real world problems and make accurate predictions. If learning to pass a test does not give you knowledge that you will use to benefit yourself and to benefit others, then what's the incentive? What's the purpose? Why would I study? Testing should be goal-directed.

There is way too much confusion about testing and its true purpose.

If the Test itself is flawed, then what's the value in passing the test?

Passing a test usually means only one thing, that you passed a particular test. So what does that mean? What does the test really confirm? What are the benefits from that test? What did you learn that was valuable from that test?

Guide is something that offers basic information or instruction. A structure or marking that serves to direct the motion or positioning of something. A model or standard for making comparisons. Someone who can find paths through unexplored territory. Direct the course; determine the direction of travelling. Be a guiding or motivating force or drive. Someone who shows the way by leading or advising.

Evaluation - Performance - Ratings

Appraisal is an expert estimation of the quality, quantity, value and other characteristics of someone or something.

Assessment is the classification of someone or something with respect to its worth.

Educational Assessment is the process of documenting knowledge, skills, attitudes, and beliefs. The systematic process of documenting and using empirical data on the knowledge, skill, attitudes, and beliefs to refine programs and improve student learning. Assessment data can be obtained from directly examining student work to assess the achievement of learning outcomes or can be based on data from which one can make inferences about learning. Assessment is often used interchangeably with test, but not limited to tests. Assessment can focus on the individual learner, the learning community (class, workshop, or other organized group of learners), a course, an academic program, the institution, or the educational system as a whole (also known as granularity). As a continuous process, assessment establishes measurable and clear student learning outcomes for learning, provisioning a sufficient amount of learning opportunities to achieve these outcomes, implementing a systematic way of gathering, analyzing and interpreting evidence to determine how well student learning matches expectations, and using the collected information to inform improvement in student learning. The final purpose of assessment practices in education depends on the theoretical framework of the practitioners and researchers, their assumptions and beliefs about the nature of human mind, the origin of knowledge, and the process of learning.

Concept Inventory is a criterion-referenced test designed to help determine whether a student has an accurate working knowledge of a specific set of concepts. Historically, concept inventories have been in the form of multiple-choice tests in order to aid interpretability and facilitate administration in large classes. Unlike a typical, teacher-authored multiple-choice test, questions and response choices on concept inventories are the subject of extensive research. The aims of the research include ascertaining (a) the range of what individuals think a particular question is asking and (b) the most common responses to the questions. Concept inventories are evaluated to ensure test reliability and validity. In its final form, each question includes one correct answer and several distractors. Ideally, a score on a criterion-referenced test reflects the amount of content knowledge a student has mastered. Criterion-referenced tests differ from norm-referenced tests in that (in theory) the former is not used to compare an individual's score to the scores of the group. Ordinarily, the purpose of a criterion-referenced test is to ascertain whether a student mastered a predetermined amount of content knowledge; upon obtaining a test score that is at or above a cutoff score, the student can move on to study a body of content knowledge that follows next in a learning sequence. In general, item difficulty values ranging between 30% and 70% are best able to provide information about student understanding. The distractors are incorrect or irrelevant answers that are usually (but not always) based on students' commonly held misconceptions. Test developers often research student misconceptions by examining students' responses to open-ended essay questions and conducting "think-aloud" interviews with students. 
The distractors chosen by students help researchers understand student thinking and give instructors insights into students' prior knowledge (and, sometimes, firmly held beliefs). This foundation in research underlies instrument construction and design, and plays a role in helping educators obtain clues about students' ideas, scientific misconceptions, and didaskalogenic ("teacher-induced" or "teaching-induced") confusions and conceptual lacunae that interfere with learning.
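
The item difficulty values mentioned above are usually calculated as the proportion of test takers who answer an item correctly. The following is a minimal sketch in Python, assuming a small made-up table of responses (1 = correct, 0 = incorrect). The function names and the data are hypothetical and are not taken from any particular concept inventory.

```python
def item_difficulty(responses):
    """Return the proportion of correct answers for each item.

    `responses` is a list of rows, one per student; each row holds
    1 (correct) or 0 (incorrect) for every item on the test.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    return [sum(row[i] for row in responses) / n_students for i in range(n_items)]


def informative_items(difficulties, low=0.30, high=0.70):
    """Return the indexes of items whose difficulty falls within [low, high]."""
    return [i for i, p in enumerate(difficulties) if low <= p <= high]


# Hypothetical data: 5 students (rows) answering 4 items (columns).
responses = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [1, 0, 1, 1],
    [1, 0, 0, 0],
]

difficulties = item_difficulty(responses)
print(difficulties)                     # proportion correct per item
print(informative_items(difficulties))  # items inside the 30% to 70% band
```

In this made-up example the first item is answered correctly by everyone, so it says little about differences in understanding, while the second and fourth items fall inside the 30% to 70% range.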

Authentic Assessment is the measurement of "intellectual accomplishments that are worthwhile, significant, and meaningful" as contrasted to multiple choice standardized tests. Authentic assessment can be devised by the teacher, or in collaboration with the student by engaging student voice. When applying authentic assessment to student learning and achievement, a teacher applies criteria related to "construction of knowledge, disciplined inquiry, and the value of achievement beyond the school." Authentic assessment tends to focus on contextualised tasks, enabling students to demonstrate their competency in a more 'authentic' setting. Examples of authentic assessment categories include: performance of the skills, or demonstrating use of a particular knowledge; simulations and role plays; and studio portfolios, strategically selecting items.

Educational Measurement refers to the use of educational assessments and the analysis of data such as scores obtained from educational assessments to infer the abilities and proficiencies of students. The approaches overlap with those in psychometrics. Educational measurement is the assigning of numerals to traits such as achievement, interest, attitudes, aptitudes, intelligence and performance.

Level of Measurement is a classification that describes the nature of information within the numbers assigned to variables. Psychologist Stanley Smith Stevens developed the best known classification with four levels, or scales, of measurement: nominal, ordinal, interval, and ratio. This framework of distinguishing levels of measurement originated in psychology and is widely criticized by scholars in other disciplines.

Statistics (math)

Health Assessment is a plan of care that identifies the specific needs of the client and how those needs will be addressed by the healthcare system. Mental Health Assessments

Library Assessment is a process undertaken by libraries to learn about the needs of users (and non-users).

Nursing assessment is the gathering of information about a patient's physiological, psychological, sociological, and spiritual status.

Political assessment is an assessment of officeholders for political donations.

Psychiatric Assessment is a process of gathering information about a person within a psychiatric or mental health service with the purpose of making a diagnosis.

Psychological Assessment is an examination into a person's mental health by a mental health professional such as a psychologist.

Risk Assessment is the determination of quantitative or qualitative value of risk related to a concrete situation and a recognized threat.

Survey Data Collection is used for marketing assessments.

Tax Assessment is the value calculated as the basis for determining the amounts to be paid or assessed for tax or insurance purposes.

Vulnerability Assessment is the process of identifying, quantifying, and prioritizing (or ranking) the vulnerabilities in a system, e.g., IT, water purification, transportation.

Writing Assessment is an area of study within composition studies that looks at the practices, technologies, and process of using writing to assess performance and potential.

Examination is the act of examining something closely (as for mistakes). A set of questions or exercises evaluating skill or knowledge. A detailed inspection of your conscience. The act of giving students or candidates a test (as by questions) to determine what they know or have learned.

Test is trying something to find out about it. Measuring sensitivity or memory or intelligence or aptitude or personality, etc.
Put to the test, as for its quality, or give experimental use to.

Acceptance Testing is a test conducted to determine if the requirements of a specification or contract are met. It may involve chemical tests, physical tests, or performance tests.

Accreditation (degrees)

Test Assessment is an assessment intended to measure a test-taker's knowledge, skill, aptitude, physical fitness, or classification in many other topics (e.g., beliefs). A test may be administered verbally, on paper, on a computer, or in a confined area that requires a test taker to physically perform a set of skills. Tests vary in style, rigor and requirements.

Test Method is a definitive procedure that produces a test result. A test can be considered a technical operation or procedure that consists of determination of one or more characteristics of a given product, process or service according to a specified procedure. Often a test is part of an experiment. The test result can be qualitative (yes/no), categorical, or quantitative (a measured value). It can be a personal observation or the output of a precision measuring instrument. Usually the test result is the dependent variable, the measured response based on the particular conditions of the test or the level of the independent variable. Some tests, however, may involve changing the independent variable to determine the level at which a certain response occurs: in this case, the test result is the independent variable.

We created Driver's Licenses so that people would have to learn how to drive so they would not destroy property or kill themselves or kill other people. But we don't have a test and a license that says you know how to live a healthy life. A person should know how to navigate life and not just know how to navigate a vehicle. A person should know how to operate their body and mind and not just know how to operate a vehicle.

Experiments (science)

Notability (academics) guideline, sometimes referred to as the professor test, is meant to reflect consensus about the notability of academics as measured by their academic achievements. For the purposes of this guideline, an academic is someone engaged in scholarly research or higher education, and academic notability refers to being known for such engagement.

E-Assessment is the use of information technology in various forms of assessment such as educational assessment, health assessment, psychiatric assessment, and psychological assessment. This may utilize an online computer connected to a network. This definition embraces a wide range of student activity ranging from the use of a word processor to on-screen testing. Specific types of e-assessment include multiple choice, online/electronic submission, computerized adaptive testing and computerized classification testing. Different types of online assessments contain elements of one or more of the following components, depending on the assessment's purpose: formative, diagnostic, or summative. Instant and detailed feedback may (or may not) be enabled. In education assessment, large-scale examining bodies find the journey from traditional paper-based exam assessment to fully electronic assessment a long one. Practical considerations such as having the necessary IT hardware to enable large numbers of students to sit an electronic examination at the same time, as well as the need to ensure a stringent level of security (for example, see: Academic Dishonesty) are among the concerns that need to be resolved to accomplish this transition.

Diagnostics (health assessments and examinations)

Assessment Errors (observation flaws)

Evaluation is a systematic determination of a subject's merit, worth and significance, using criteria governed by a set of standards or using accurate information and up to date knowledge. It can assist an organization, program, project or any other intervention or initiative to assess any aim, realisable concept/proposal, or any alternative, to help in decision-making; or to ascertain the degree of achievement or value in regard to the aim and objectives and results of any such action that has been completed. The primary purpose of evaluation, in addition to gaining insight into prior or existing initiatives, is to enable reflection and assist in the identification of future change.

Measurement (math)

Types of Tests (wiki)

Psychological Testing is an instrument designed to measure unobserved constructs, also known as latent variables. Psychological tests are typically, but not necessarily, a series of tasks or problems that the respondent has to solve. Psychological tests can strongly resemble questionnaires, which are also designed to measure unobserved constructs, but differ in that psychological tests ask for a respondent's maximum performance whereas a questionnaire asks for the respondent's typical performance. A useful psychological test must be both valid (i.e., there is evidence to support the specified interpretation of the test results) and reliable (i.e., internally consistent or give consistent results over time, across raters, etc.).
Psychological Assessment, crazy people testing normal people, how ironic.

Psychometrics is a field of study concerned with the theory and technique of psychological measurement.

Latent Variables are variables that are not directly observed but are rather inferred (through a mathematical model) from other variables that are observed (directly measured). Mathematical models that aim to explain observed variables in terms of latent variables are called latent variable models. Latent variable models are used in many disciplines, including psychology, economics, engineering, medicine, physics, machine learning/artificial intelligence, bioinformatics, natural language processing, econometrics, management and the social sciences.

Observable Variable is a variable that can be observed and directly measured.

Physical Fitness Test is a test designed to measure physical strength, agility, and endurance. They are commonly employed in educational institutions as part of the physical education curriculum, in medicine as part of diagnostic testing, and as eligibility requirements in fields that focus on physical ability.
Physical Fitness Test (gov)

Driving Test is a procedure designed to test a person's ability to drive a motor vehicle. A driving test generally consists of one or two parts: the practical test, called a road test, used to assess a person's driving ability under normal operating conditions, and/or a written or oral test (theory test) to confirm a person's knowledge of driving and relevant rules and laws.

Vulnerability Assessment (planning)

Peer Assessment (social learning)

Equating refers to the statistical process of determining comparable scores on different forms of an exam. It can be accomplished using either classical test theory or item response theory. In item response theory, equating is the process of placing scores from two or more parallel test forms onto a common score scale. The result is that scores from two different test forms can be compared directly, or treated as though they came from the same test form. When the tests are not parallel, the general process is called linking. It is the process of equating the units and origins of two scales on which the abilities of students have been estimated from results on different tests. The process is analogous to equating degrees Fahrenheit with degrees Celsius by converting measurements from one scale to the other. The determination of comparable scores is a by-product of equating that results from equating the scales obtained from test results.
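
One simple classical approach to the process described above is linear equating, which matches the mean and standard deviation of the two score distributions, in the same spirit as the Fahrenheit to Celsius conversion. The following is a minimal sketch in Python, assuming that two comparable groups of test takers each took one form. The function name and the scores are hypothetical, and real equating studies use more careful designs than this.

```python
from statistics import mean, pstdev


def linear_equate(score_x, scores_x, scores_y):
    """Map a score on form X to the scale of form Y by linear equating.

    The conversion lines up the means and standard deviations of the
    two score distributions.
    """
    slope = pstdev(scores_y) / pstdev(scores_x)
    return mean(scores_y) + slope * (score_x - mean(scores_x))


# Hypothetical scores from two comparable groups of test takers.
form_x = [10, 12, 14, 16, 18]  # mean 14
form_y = [20, 23, 26, 29, 32]  # mean 26

# An average score on form X maps to (about) the average score on form Y.
print(linear_equate(14, form_x, form_y))
print(linear_equate(18, form_x, form_y))
```

Once the two forms share a scale, a score from either form can be read as though it came from the same test.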

Rubric is "a scoring guide used to evaluate the quality of students' constructed responses". Rubrics usually contain evaluative criteria, quality definitions for those criteria at particular levels of achievement, and a scoring strategy. They are often presented in table format and can be used by teachers when marking, and by students when planning their work.

IQ Testing (intelligence)

Bias in Mental Testing - IQ tests are Culturally Biased

History of the Race and Intelligence Controversy concerns the historical development of a debate, concerning possible explanations of group differences encountered in the study of race and intelligence. Since the beginning of IQ testing around the time of World War I there have been observed differences between average scores of different population groups, but there has been no agreement about whether this is mainly due to environmental and cultural factors, or mainly due to some genetic factor, or even if the dichotomy between environmental and genetic factors is the most effectual approach to the debate.

Cattell Culture Fair III is a measure of cognitive abilities designed to estimate intelligence devoid of sociocultural and environmental influences.

Purdue Spatial Visualization Test: Visualization of Rotation (PSVT:R) is a test of spatial visualization ability. Like most measures of spatial ability, the PSVT:R shows sex differences. A meta-analysis of 40 studies found a Hedges's g of 0.57 in favor of males.
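
Hedges's g is a standardized difference between two group averages: the difference in means divided by the pooled standard deviation, with a small correction for sample size. The following is a minimal sketch in Python using made-up summary statistics. The numbers are hypothetical and are not the data from the meta-analysis mentioned above, and the correction factor is the commonly used approximation.

```python
from math import sqrt


def hedges_g(mean1, sd1, n1, mean2, sd2, n2):
    """Standardized mean difference with Hedges's small-sample correction."""
    pooled_sd = sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    cohens_d = (mean1 - mean2) / pooled_sd
    correction = 1 - 3 / (4 * (n1 + n2) - 9)  # common approximation
    return cohens_d * correction


# Hypothetical summary statistics for two groups of 20 test takers.
g = hedges_g(mean1=21.0, sd1=5.0, n1=20, mean2=18.0, sd2=5.0, n2=20)
print(round(g, 2))
```

A value near 0.5 is often described as a medium-sized difference between group averages. It says nothing about any individual test taker, because the two score distributions still overlap heavily.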

The Chitling Test was designed to demonstrate differences in understanding and culture between races, specifically between African Americans and Whites. In determining how streetwise someone is, the Chitling Test may have validity, but there have been no studies demonstrating this. Furthermore, the Chitling Test has only proved valid as far as face validity is concerned; no evidence has been brought to light on the Chitling predicting performance.

Requirements Analysis is the process of determining the needs or conditions to meet for a new or altered product or project.

Physical Test is a qualitative or quantitative procedure that consists of determination of one or more characteristics of a given product, process or service according to a specified procedure. Often this is part of an experiment.

Usability Testing is a technique used in user-centered interaction design to evaluate a product by testing it on users. This can be seen as an irreplaceable usability practice, since it gives direct input on how real users use the system. This is in contrast with usability inspection methods where experts use different methods to evaluate a user interface without involving users. Usability testing focuses on measuring a human-made product's capacity to meet its intended purpose. Examples of products that commonly benefit from usability testing are foods, consumer products, web sites or web applications, computer interfaces, documents, and devices. Usability testing measures the usability, or ease of use, of a specific object or set of objects, whereas general human-computer interaction studies attempt to formulate universal principles.

Acceptance Testing is a test conducted to determine if the requirements of a specification or contract are met. It may involve chemical tests, physical tests, or performance tests, which is an assessment that requires an examinee to actually perform a task or activity, rather than simply answering questions referring to specific parts. The purpose is to ensure greater fidelity to what is being tested. Quality Control

Testing Maturity Model has five Levels. Level 1 – Initial: At this level an organisation is using ad hoc methods for testing, so results are not repeatable and there is no quality standard. Level 2 – Definition: At this level testing is defined as a process, so there might be test strategies, test plans, test cases, based on requirements. Testing does not start until products are completed, so the aim of testing is to compare products against requirements. Level 3 – Integration: At this level testing is integrated into a software life cycle, e.g. the V-model. The need for testing is based on risk management, and the testing is carried out with some independence from the development area. Level 4 – Management and measurement: At this level testing activities take place at all stages of the life cycle, including reviews of requirements and designs. Quality criteria are agreed for all products of an organisation (internal and external). Level 5 – Optimization: At this level the testing process itself is tested and improved at each iteration. This is typically achieved with tool support, and also introduces aims such as defect prevention through the life cycle, rather than defect detection (zero defects).
Each level from 2 upwards has a defined set of processes and goals, which lead to practices and sub-practices.

High-Stakes Test is a test with important consequences for the test taker. Passing has important benefits, such as a high school diploma, a scholarship, or a license to practice a profession. Failing has important disadvantages, such as being forced to take remedial classes until the test can be passed, not being allowed to drive a car, or not being able to find employment.

Standards-Based Assessment is assessment that relies on the evaluation of student understanding with respect to agreed-upon standards, also known as "outcomes". The standards set the criteria for the successful demonstration of the understanding of a concept or skill.

Educational Assessment is the process of documenting, usually in measurable terms, knowledge, skill, attitudes, and beliefs. It is a tool or method of obtaining information from tests or other sources about the achievement or abilities of individuals.

Skills Analysis (aptitude)


Final Exam is a test given to students at the end of a course of study or training.

Term Paper is a research paper written by students over an academic term, accounting for a large part of a grade. Term papers are generally intended to describe an event, a concept, or argue a point. A term paper is a written original work discussing a topic in detail, usually several typed pages in length and is often due at the end of a semester.

Passing a driving test does not say that you will be a good driver, it only says that you are allowed to drive a car.

Passing a math test does not say that you will count the things that matter, it only says that you know how to count in a particular way.

Passing an English test doesn't mean that you will speak and write valuable things, it only says that you can speak and write.

Passing a reading test does not say that you fully understand all the text that you read, it only says that you understood what was written on a particular test.

Passing the Bar Exam doesn't say that you will be a good lawyer, nor does it say that you will be an honest lawyer, it only says that you can practice law.

Everyone needs to fully understand how ineffective some tests are, and that some tests are even inaccurate and misleading. Tests are dangerous because they could give a person a false sense of accomplishment, they could also give a false sense of security, and tests could also give some people a false sense of importance and value. You are literally being conned and taken for a fool. And the only way to stop this abuse is to create more accurate tests, tests that are proven to work. So the test itself must be certified, and certified in an open forum of experts from around the world. We need a stamp of approval, something that guarantees authenticity. Then we could start accurately measuring intelligence and abilities, which is the Goal of BK101.

Teaching to a Test is the same thing as teaching ideology, it's the same thing as teaching someone's version of reality, it's the same thing as teaching someone's personal belief, that's not education, that's indoctrination, without any vision, without any responsibility or accountability. It is simply wrong, inaccurate, ineffective, inefficient, and f*cking criminal, and you need to stop it.
Teaching to the Test (wiki)

Stop forcing children to produce answers for teachers and tests, because it disconnects children from learning. Children should learn how to produce answers for themselves, so that they understand that learning is their responsibility and no one else's. You give them problems to solve, and you give them the tools and resources to locate needed information. Ask a question like, "How long would the sound of your voice take to travel around the earth?" Then see what they come up with, and show them the answers they needed to know in order to come up with the answer. Don't force them to memorize, teach them how to learn. Test them on how much they understand about themselves and the world around them. Stop testing them for things that teachers force students to memorize.

Mastering a subject has become less about learning and more about performance. If you ask most students what they think their role is in math classrooms, they will tell you it is to get questions right.

Everyone should have the answers to their questions, but everyone should also know how the answers were calculated and figured out manually.

"My child is an honor student" is such a stupid saying. It would be better to say, "My child is Totally Awesome, and Smart." The value of your child is not measured by schools or grades. The value of a child is measured by their abilities and the qualities they have learned in their life. Life is the only true measurer of a person. It's what you learned in life, and it's not just what you learned in schools.

We need to use fewer generalized phrases like "good job", especially after five years of age. As children get older, you should speak in more detail. When they do something good, you should say, "I like what you did there and this is why I like it", instead of just saying "good job." And also remind them that your opinion is only one opinion, and not an expert opinion, unless of course you are an expert, but even then, you should still encourage your child to always get a second opinion, if they can.

Same thing for calling your kid smart, tell them why you think they are smart, and that being smart doesn't mean they will stop making mistakes, it just means that the fear of making mistakes will be less, and that they will also have a better chance of learning from their mistakes, but only if they keep learning.

It doesn't matter what school you come from. What matters is what you have learned so far, what you plan to learn more about now and in the future, and why. You must choose how you want to benefit society, and don't let others make the choice for you, because it may not be what you want to do with your life.

The test has to have answers to real problems, so that when you get the answer right, you know how to solve a real problem, and not just know how to solve a problem on paper, or on a computer screen, which you will most likely forget about.
You need to be able to visualize something, you have to be able to relate to something, if not, then you will most likely forget what you have learned. This is one of the failures of testing, no one remembers all the questions and answers on a test, or if the test was relevant to their life or to their abilities.

Learning by Association and Counting the things That Matter.

Timing of Tests: if you're not ready, then you will do poorly. But if the test was only a guide to see how much you know and don't know, then the test would not be a do or die situation. A student has the right to choose their own speed of learning, as long as they are aware of top speeds, so that they have something to compare themselves to, but not something to be judged by.

What good is passing a test if you just end up forgetting everything about the test later on? You should remember a test. You should remember that day as being a day that you learned something valuable, or at the least, a day that you confirmed that you learned something valuable, and you should always be able to remember what you have learned...That is what testing should be. If it is not, then you're mostly just wasting time, potential, energy and resources. And the nerve that schools make people pay for that, how f*cking dare you. So please step away from the child, you have no right to teach.

Testing has sadly become a weapon of control and manipulation and a deceitful method of misinformation.

Testing is giving too many people a false sense of accomplishment, and worse, testing is giving too many people a false sense of failure. 

If we carefully examine test questions, we find that most of them do not measure understanding, nor do they measure intelligence, which proves that ignorant and corrupt people should not create tests.

"If you're testing for the things that don't matter then testing doesn't matter....in a way school testing is almost criminal"

"Grades don't determine intelligence, they mostly determine obedience."

High-Stakes Tests a likely factor in STEM performance gap. Performance gaps between male and female students increased or decreased based on whether instructors emphasized or de-emphasized the value of exams.

Reinforcement Dangers

School testing methods are more like a Public Survey than they are an actual Test. The test is more about confirming how ignorant you are. But not to correct your ignorance, but to verify that the school's ignorant teaching methods have succeeded in making students mindless and unaware. Most of today's School Textbooks are filled with propaganda and should only be used as an Index Book. They can also be used to show ignorant teaching methods. So not a total waste of paper, but close.

Knowing the answer to a question on a test only confirms one thing, and that is that you remembered the right answer at the right moment. Knowing the answer to a question on a test doesn't confirm that you understand what the answer means, nor does it confirm that the question is even valid.

Admission Tests to Colleges

So what does a Grade really tell you? You can have low grades and still be a great person, a great parent and even a great leader, and on the other hand, you can have high grades and end up becoming a criminal or a murderer. If you get a "B" then how does that explain what part of the knowledge you didn't understand?

Grading on a Curve

Testing should not be for telling us how much you know or how much you don’t know, but more importantly, testing should be for telling us how much more knowledge you need to learn. So a test should only be used as a guide and not be used as a prerequisite or for a final conclusion. This is because tests rarely confirm exactly how much knowledge a person has.

This is because the people who design tests are confused about what the test's true intentions should be.

"Testing is a tool used in the learning process. That is why you need answers in writing that explain how the information is being understood. This way a teacher can learn from the students' answers, so they can continue to improve the teaching methods and continue to improve the testing questions."

Testing should consist of real life scenarios, scenarios that people would most likely encounter during their life. This way they can practice problem solving, as well as test their awareness and their focusing abilities. They can also test their ability to predict outcomes, because predicting the future is one of our greatest abilities. You can create some tests in the same style as some video games. Even though it's not real, the knowledge and information a person will learn is real. A Test should also have knowledge and information that can be easily remembered. Why else take a test if you're not going to remember what you learned? Other testing would cover a person's understanding of symbols, languages, and their uses. People with limited knowledge of symbols and language would have a modified test, one that is still challenging but a little less complex.

Teachers need to be tested as much as students, and the test for teachers needs to be designed by the students.

So your first test will have to be for the people who design tests. The results of this test must be made public so that all the questions and answers that are given can be examined and understood. This way a test can be created that has real purpose and meaning.

Tests need to be more like a lesson that says that this information and knowledge is important.

Multiple Choice is not testing. Multiple Choice is a form of an objective assessment in which respondents are asked to select the only correct answer out of the choices from a list. The multiple choice format is most frequently used in educational testing, in market research, and in elections, when a person chooses between multiple candidates, parties, or policies.

Process of Elimination is a method to identify an entity of interest among several by excluding all the other entities. In educational testing, the process of elimination is the process of deleting options whose possibility of being correct is close to zero, or significantly lower than that of the other options. The process does not guarantee success, even if only 1 option remains.

Behavior-Driven Development is a software development process that emerged from test-driven development (TDD). Behavior-driven development combines the general techniques and principles of TDD with ideas from domain-driven design and object-oriented analysis and design to provide software development and management teams with shared tools and a shared process to collaborate on software development.

Learning Methods - Teaching Methods

Learning without memory is impossible, but just remembering does not guarantee that you are actually Learning.

Connecticut Mastery Test

Standardized Tests in the United States

Educational Testing Service
Problems with Testing
Reasons Why Standardized Tests Are Not Working
Education Assessment Fact Sheet (PDF)
PARCC Testing
California High School Exit Exam CHSEE

Last Week Tonight with John Oliver: Standardized Testing (youtube)

Fair Test
The National Center for Fair & Open Testing (FairTest)

Educational Records Bureau is the only not-for-profit educational services organization offering assessments for both admission and achievement for independent and selective public schools for Pre K-grade 12.

Pearson Assessments

International Student Assessment (PISA)

Programme for International Student Assessment is a worldwide study by the Organisation for Economic Co-operation and Development (OECD) in member and non-member nations of 15-year-old school pupils' scholastic performance on mathematics, science, and reading. It was first performed in 2000 and then repeated every three years. Its aim is to provide comparable data with a view to enabling countries to improve their education policies and outcomes. It measures problem solving and cognition in daily life.

Scales (PDF)

National Achievement Test is a set of examinations taken in the Philippines by students in Years 6, 10, and 12. Students are given a national standardised test designed to determine their academic levels, strengths and weaknesses. The knowledge they have learnt throughout the year is divided into 5 categories: English, Filipino, Math, Science and Araling Panlipunan (Social Studies in English), and they are tested for what they know. NAT examinations aim to: 1. provide empirical information on the achievement level of pupils/students in Grades Six, Ten, and Twelve to serve as a guide for policy makers, administrators, curriculum planners, supervisors, principals and teachers in their respective courses of action. 2. identify and analyze variations in achievement levels across the years by region, division, school and other variables. 3. determine the rate of improvement in basic education with respect to individual schools within certain time frames.

National Achievement Tests
National Assessment Resource

Assessment-Based Accountability System
American College Testing

Standards-Based Assessment

"Public schools waste away their students' lives Teaching to Tests, not because they believe that's the best way for students to learn, but because their credentials depend on test scores". Alex Reid

"The Commodification of Learning"

"Imagine teaching to a test that only confirms 10% of the useful knowledge that a human needs. What ignorant moron would do that? An ignorant moron who was taught that same way of course."

Teaching Assessment  -  Curriculum Standards

Standards-Based Education Reform since the 1980s has been largely driven by the setting of academic standards for what students should know and be able to do. These standards can then be used to guide all other system components. The SBE (standards-based education) reform movement calls for clear, measurable standards for all school students. Rather than norm-referenced rankings, a standards-based system measures each student against the concrete standard. Curriculum, assessments, and professional development are aligned to the standards.

Core Standards
Content Standards
Program for International Student Assessment
Performance Assessment
Learning Teaching and Assessment 
Inspection and Review

Teaching Resources

Teaching Evaluations

Teacher Vision
Learning and Teaching
Teaching Resources
M.E.T. Project

"We learn more by looking for the answer to a question and not finding it than we do from learning the answer itself."
(Lloyd Alexander)

Self Directed Learning
What testing should be like

Automated Essay Scoring is the use of specialized computer programs to assign grades to essays written in an educational setting. It is a method of educational assessment and an application of natural language processing. Its objective is to classify a large set of textual entities into a small number of discrete categories, corresponding to the possible grades—for example, the numbers 1 to 6. Therefore, it can be considered a problem of statistical classification.
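
To show what "statistical classification" means here, the following is a minimal Python sketch that sorts an essay into one of a few grade categories by comparing its words with previously graded essays. It is only an illustration of the framing, not how any real scoring product works. The function names (bag_of_words, cosine, train, predict) and the four graded sample essays are hypothetical and invented for this example.

```python
from collections import Counter
from math import sqrt

def bag_of_words(text):
    """Count how often each lowercase word appears in the text."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def train(graded_essays):
    """Pool the word counts of all essays that received the same grade."""
    centroids = {}
    for text, grade in graded_essays:
        centroids.setdefault(grade, Counter()).update(bag_of_words(text))
    return centroids

def predict(centroids, text):
    """Assign the grade whose pooled word counts are most similar to the essay."""
    words = bag_of_words(text)
    return max(centroids, key=lambda grade: cosine(centroids[grade], words))

# Hypothetical, hand-written training essays with made-up grades.
graded = [
    ("plants use sunlight water and carbon dioxide to make glucose and oxygen", 6),
    ("photosynthesis converts light energy into chemical energy stored in glucose", 6),
    ("plants grow in the sun", 2),
    ("plants are green and need sun", 2),
]

centroids = train(graded)
print(predict(centroids, "plants make glucose from sunlight and carbon dioxide"))
```

Note that a classifier like this only rewards essays that resemble earlier high-scoring essays. It says nothing about whether the writer understands the subject.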

The National Council on Measurement in Education is an organization serving assessment professionals. These professionals work in evaluation, testing, program evaluation, and, more generally, educational and psychological measurement. Members come from universities, test development organizations, and industry. A goal of the organization is to ensure that assessment is carried out fairly. (NCME).

False Premise of National Education Standards
What is Assessed Curriculum?

Entrance Examinations - AP

Testing Maturity Model aims to be used in a similar way to CMM, that is, to provide a framework for assessing the maturity of the test processes in an organisation, and so provide targets for improving maturity.

A statement in Connecticut's 2006 Mastery Test Mathematics Handbook.
This shows how little some people know. Education will not improve when ignorance is making decisions.

Connecticut Mastery Test (wiki)


The State Board of Education believes that the recent debate pitting the acquisition of basic skills against the development of conceptual understanding argues a false dichotomy. Rather, basic skills and conceptual understanding are intertwined, and both are necessary before students can successfully apply mathematics to the solution of problems. A strong mathematics program will enable students to do each with ease. Unfortunately, not enough students in Connecticut or in the nation are sufficiently developing the facility, understanding, level of confidence and interest in mathematics to meet our present and future societal needs. Therefore, we must fully engage in the quest to provide every student with a strong mathematics program, beginning in the earliest grades.


Instead of beliefs show some proof, facts and examples next time. That's what's called "teaching" by the way. 

How much Math does a person need in order to be Intelligent, Productive and Aware?
When, Why and How Much advanced math should a student learn?
Besides, it's not how much Math you teach, but how you teach it.

Tested describes something or someone that has been tested and proved useful, correct or reliable; tried and true, well-tried.
To test is to put something to the test, as for its quality, or to give it experimental use; to examine someone's knowledge of something; or to determine the presence or properties of a substance.

Self-Directed Learning

Students are stressed out by tests mostly because most tests are inadequate and irrelevant in measuring useful intelligence or skills. Students will only be excited about test taking when we make tests relevant and accurate in measuring their abilities and their awareness, things that are valuable to them at this time in their life...An A or an F does not matter if the test does not matter.

Performance cannot be measured by test scores alone. Real performance can only be accurately measured by a person's actions in life that produce positive outcomes. A test score can only be used as a guide, and not a determining factor of worth. A college degree is only a piece of paper. Positive actions in reality are the only indicators of success.

Tests That Look Like Video Games

Learning Games

Objective Structured Clinical Examination

Objective Structured Clinical Examination is a modern type of examination designed to test clinical skill performance and competence in skills such as communication, clinical examination, medical procedures / prescription, exercise prescription, joint mobilisation / manipulation techniques, radiographic positioning, radiographic image evaluation and interpretation of results. It is a hands-on, real-world approach to learning that keeps you engaged, allows you to understand the key factors that drive the medical decision-making process, and challenges the professional to be innovative and reveals their errors in case-handling and provides an open space for improved decision making based on evidence based practice for real world responsibilities.

An OSCE usually comprises a circuit of short stations (usually 5–10 minutes each, although some use up to 15 minutes), in which each candidate is examined on a one-to-one basis with one or two impartial examiners and either real or simulated patients (actors or electronic patient simulators). Each station has a different examiner, as opposed to the traditional method of clinical examinations where a candidate would be assigned to an examiner for the entire examination. Candidates rotate through the stations, completing all the stations on their circuit. In this way, all candidates take the same stations. It is considered to be an improvement over traditional examination methods because the stations can be standardized, enabling fairer peer comparison, and complex procedures can be assessed without endangering patients' health.
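
The rotation described above can be sketched in a few lines of Python. This is only an illustration under a simplifying assumption that there are exactly as many candidates as stations. The function name rotation_schedule, the candidate names and the station names are all hypothetical.

```python
def rotation_schedule(candidates, stations):
    """Return one dict per round, mapping each station to the candidate at it.

    Assumes as many candidates as stations, so nobody waits between rounds.
    """
    if len(candidates) != len(stations):
        raise ValueError("this sketch assumes one candidate per station")
    n = len(stations)
    rounds = []
    for r in range(n):
        # In round r, the candidate who started at station i has moved on r places.
        rounds.append({stations[(i + r) % n]: candidates[i] for i in range(n)})
    return rounds

# Hypothetical candidates and stations, for illustration only.
candidates = ["Ana", "Ben", "Chi"]
stations = ["history taking", "examination", "data interpretation"]

for number, assignment in enumerate(rotation_schedule(candidates, stations), start=1):
    print(f"round {number}: {assignment}")
```

After as many rounds as there are stations, every candidate has visited every station exactly once, which is what lets all candidates be compared on the same tasks.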

As the name suggests, an OSCE is designed to be:

Objective - all candidates are assessed using exactly the same stations (although if real patients are used, their signs may vary slightly) with the same marking scheme. In an OSCE, candidates get marks for each step on the mark scheme that they perform correctly, which makes the assessment of clinical skills more objective rather than subjective.

Structured - stations in OSCEs have a very specific task. Where simulated patients are used, detailed scripts are provided to ensure that the information that they give is the same to all candidates, including the emotions that the patient should use during the consultation. Instructions are carefully written to ensure that the candidate is given a very specific task to complete. The OSCE is carefully structured to include parts from all elements of the curriculum as well as a wide range of skills.

A clinical examination - the OSCE is designed to apply clinical and theoretical knowledge. Where theoretical knowledge is required, for example in answering questions from the examiner at the end of the station, the questions are standardized and the candidate is only asked questions that are on the mark sheet. If the candidate is asked any others, there will be no marks for them.
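
The idea of earning marks only for steps that appear on the mark scheme can be shown with a small Python sketch. The function name score_station, the blood-pressure station, its steps and its mark values are all hypothetical and chosen only to illustrate the principle.

```python
def score_station(mark_scheme, steps_performed):
    """Return (marks earned, marks available) for one station.

    mark_scheme maps each expected step to its marks. Anything the candidate
    did that is not on the scheme earns nothing.
    """
    earned = sum(marks for step, marks in mark_scheme.items() if step in steps_performed)
    available = sum(mark_scheme.values())
    return earned, available

# Hypothetical mark scheme for a blood-pressure station (illustrative only).
bp_station = {
    "introduces self": 1,
    "explains procedure": 1,
    "positions cuff correctly": 2,
    "records reading": 2,
}

# This candidate skipped one step and did one thing that is not on the scheme.
performed = {"introduces self", "positions cuff correctly", "records reading", "offers tea"}

earned, available = score_station(bp_station, performed)
print(f"{earned}/{available} marks")
```

Because every examiner applies the same checklist, two candidates who perform the same steps receive the same marks, whoever is examining them.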

Objective Structured Clinical Examination: The Assessment of Choice

The Objective Structured Clinical Examination is a versatile multipurpose evaluative tool that can be utilized to assess health care professionals in a clinical setting. It assesses competency, based on objective testing through direct observation. It is precise, objective, and reproducible allowing uniform testing of students for a wide range of clinical skills. Unlike the traditional clinical exam, the OSCE could evaluate areas most critical to performance of health care professionals such as communication skills and ability to handle unpredictable patient behavior.

The OSCE comprises several "stations" in which examinees are expected to perform a variety of clinical tasks within a specified time period against criteria formulated to the clinical skill, thus demonstrating competency of skills and/or attitudes. The OSCE has been used to evaluate those areas most critical to performance of health care professionals, such as the ability to obtain/interpret data, problem-solve, teach, communicate, and handle unpredictable patient behavior, which are otherwise impossible to assess in the traditional clinical examination. Any attempt to evaluate these critical areas in the old-fashioned clinical case examination will seem to be assessing theory rather than simulating practical performance.

Advantages and Disadvantages of OSCE

Written examinations (essays and multiple choice) test cognitive knowledge, which is only one aspect of competency. The traditional clinical examination basically tests a narrow range of clinical skills under the observation of normally two examiners in a given clinical case. The scope of the traditional clinical exam is basically patient histories, demonstration of physical examinations, and assessment of a narrow range of technical skills. It has been shown to be largely unreliable in testing students’ performance and has a wide margin of variability between one examiner and the other. Data gathered by the National Board of Medical Examiners in the USA (1960–1963), involving over 10,000 medical students, showed that the correlation of independent evaluations by two examiners was less than 0.25. It has also been demonstrated that the luck of the draw in selection of examiner and patient played a significant role in the outcome of postgraduate examinations in psychiatry using the traditional method.
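
To make the examiner-agreement figure more concrete, here is a minimal Python sketch of how a correlation between two examiners' independent marks can be computed. The marks below are made-up illustration data, not the National Board figures, and pearson is a hypothetical helper name.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of marks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two marks")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one examiner gives identical marks")
    return cov / sqrt(var_x * var_y)

# Hypothetical marks given by two examiners to the same six candidates.
examiner_a = [6, 7, 5, 8, 4, 6]
examiner_b = [5, 6, 7, 6, 5, 4]

r = pearson(examiner_a, examiner_b)
print(f"inter-examiner correlation: {r:.2f}")
```

A value near 1 would mean the two examiners rank candidates almost identically. A value near 0, as with these invented marks, means that knowing one examiner's mark tells you very little about the other's.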

Published findings of researchers on the OSCE from its inception in 1975 to 2004 have reported it to be reliable, valid and objective, with cost as its only major drawback. The OSCE also covers a broader range of skills, such as problem solving, communication skills, decision-making and patient management abilities.

The advantages of the OSCE, apart from its versatility and ever-broadening scope, are its objectivity, reproducibility, and easy recall. All students are examined against predetermined criteria on the same or similar clinical scenarios or tasks, with marks written down against those criteria, thus enabling recall, teaching audit and determination of standards. In a study from Harvard Medical School, second-year students were found to perform better on interpersonal and technical skills than on interpretative or integrative skills. This allows for review of teaching technique and curricula.

Performance is judged not by two or three examiners but by a team of many examiners in charge of the various stations of the examination. This is to the advantage of both the examinee and the teaching standard of the institution, as the outcome of the examination is not affected by prejudice and standards are determined by many more teachers, each looking at a particular issue in the training. The OSCE takes a much shorter time to execute, examining more students in any given time over a broader range of subjects.

However, no examination method is flawless, and the OSCE has been criticized for using unreal subjects, even though actual patients can be used according to need. The OSCE is also more difficult to organize and requires more materials and human resources.

How is OSCE done?

The OSCE’s basic structure is a circuit of assessment stations, where examiners, using previously determined criteria, assess a range of practical clinical skills on an objective marking scheme.

Such stations could involve several methods of testing, including use of multiple choice or short precise answers, history taking, demonstration of clinical signs, interpretation of clinical data, practical skills and counselling sessions among others. Most OSCEs use "standardized patients (SP)" for accomplishing clinical history, examination and counselling sessions. Standardized patients are individuals who have been trained to exhibit certain signs and symptoms of specific conditions under certain testing conditions.

The basic steps in modelling an OSCE exam include:
Determination of the OSCE team.
Skills to be assessed (CE Stations).
Objective marking schemes.
Recruitment and training of the standardized patients.
Logistics of the examination process.

The OSCE Team

Examiners, marshals and timekeepers are required. Some stations could be unmanned, such as those for data or image interpretation, but most require an examiner to objectively assess candidate performance based on the pre-set criteria. Having a reserve examiner who can step in at the last minute if required is good practice. Examiners must be experienced, and a standard must be agreed upon at the outset. Examiners must be prepared to dispense with personal preferences in the interests of objectivity and reproducibility and must assess students according to the marking scheme. Marshals and timekeepers are required for correct movement of candidates and accurate time keeping. The OSCE is expensive in terms of manpower requirement.

Skills Assessed in OSCEs

The tasks to be assessed should be of different types and of varying difficulties to provide a mixed assessment circuit. The tasks in an OSCE depend on the level of the students' training. Early in undergraduate training, correct technique of history taking and demonstration of physical signs to arrive at a conclusion may be all that is required.

At the end of the training, however, testing a broader range of skills may be required. This could include formulation of a working diagnosis, data and image interpretation, requesting and interpreting investigations, as well as communication skills. Postgraduate medicine may involve more advanced issues like decision taking, handling of complex management issues, counselling, breaking bad news and practical management of emergency situations. There are no hard and fast rules for the skills tested; they are determined by the aim of the assessment. Complex stations for postgraduate students could test varying skills including management problems, administrative skills, handling unpredictable patient behaviour and data interpretation. These assessments and many others are impossible in the traditional clinical examination.

Objective marking scheme

The marking scheme for the OSCE is decided in advance and objectively designed. It must be concise, well focused and unambiguous, aiming to reward actions that discriminate a good performance from a poor one. The marking scheme must take cognizance of all possible performances and provide scores according to the level of the student’s performance. It may be necessary to read out clear instructions to the candidates on what is required of them in that station. Alternatively, a written instruction may be kept in the unmanned station.

It is good practice to perform a dummy run of the various stations, which enables exam designers to ensure that the tasks can be completed in the time allocated and to modify the tasks if necessary. Candidates should be provided with answer booklets for the answers to tasks on the unmanned stations, which should be handed over and marked at the end of the examination.

Recruitment and Training of Standardized or Simulated Patients

Vu and Barrows defined standardized patients as "real" or "simulated" patients who have been coached to present a clinical problem. Standardized patients may be professionally trained actors, volunteer simulators or even housewives who have no acting experience. Their use encompasses undergraduate and postgraduate learning, the monitoring of doctors’ performance and standardization of clinical examinations. Simulation has been used for instruction in industry and the military for a much longer period, but the first known effective use of simulated patients was by Barrows and Abrahamson (1964), who used them to appraise students’ performance in clinical neurology examinations.

SP candidates must be intelligent, flexible, quick thinking, and reliable. Standardized patients’ understanding of the concept of the OSCE and the role given to them is critical to the overall process.

An advantage of simulated patients over real patients is that of allowing different candidates to be presented with a similar challenge, thereby reducing an important source of variability. They also have reliable availability and adaptability, which enables the reproduction of a wide range of clinical phenomena tailored to the student’s level of skill. In addition, they can simulate scenarios that may be distressing for a real patient, such as bereavement or terminal illness. Their use also removes the risk of injury or litigation that comes with using real patients for examination, especially in sensitive areas of medicine like obstetrics and gynecology.

The validity of the use of SP in clinical practice has been proved by both direct and indirect means. In a double-blind study, simulated patients were substituted for real patients in the individual patient assessment of mock clinical examinations in psychiatry. Neither the examiners nor the students could detect the presence of simulated patients among the real patients. Indirect indicators of validity might include the fact that simulators are rarely distinguished from real patients.

Simulated patients are, however, expensive in terms of the time it takes to train and coach them in performing and understanding concepts. This can be very difficult in some fields like pediatrics, where problems in very young children need to be simulated. The cost of paying professionals adds to the expense. However, the time efficiency of the OSCE and its versatility make the cost worthwhile. Recruitment and training of SPs is critical to the success of the OSCE. SPs can be used not only for history taking and counselling, but also for eliciting physical findings that can be simulated, including aphasia, facial paralysis, hemiparetic gait, and hyperactive deep tendon reflexes.

Logistics of the examination process

Enough space is required for running the circuit and to accommodate the various stations, equipment and materials for the exam. The manned stations should accommodate an examiner, a student and possibly the standardised patient, and also allow for enough privacy of discussion so that the students performing other tasks are not distracted or disturbed. A large clinic room, completely cleared, could be ideal and may have the further advantage of clinic staff who will volunteer to help run the examination, thereby reducing cost.

The stations should be clearly marked and the direction of flow should also be unambiguous. It is good practice to have a test run involving all candidates for that circuit so that they acquaint themselves with the direction of movement and the sound of the bell.


The OSCE style of clinical assessment, given its obvious advantages, especially in terms of objectivity, uniformity and versatility of clinical scenarios that can be assessed, shows superiority over traditional clinical assessment. It allows evaluation of clinical students at varying levels of training within a relatively short period, over a broad range of skills and issues. OSCE removes prejudice in examining students and allows all to go through the same scope and criteria for assessment. This has made it a worthwhile method in medical practice.

Fill in the Blank

The Blank is your mind..

The Thinker Man