Saturday, June 28, 2008

"If we keep doing things like this, the 'physics test incident' will happen again!"

Source: Tuổi Trẻ Online

Tuesday, 24/06/2008, 20:03 (GMT+7)

http://www.tuoitre.com.vn/Tianyon/Index.aspx?ArticleID=265219&ChannelID=13

Prof. Lâm Quang Thiệp:

"If we do not change the way we work, the 'physics paper' incident will happen again!"


Prof. Lâm Quang Thiệp
TTO - The error in the answer key of the multiple-choice (MC) physics paper in the upper-secondary graduation exam has been resolved by the Ministry of Education and Training (MOET), but it has left behind an unresolved debate over the "correct answer", along with worries about the MC papers in the upcoming university and college entrance exams.

>> The physics answer key is appropriate
>> MOET officially adjusts the physics answer key
>> On the physics answer key: if it is wrong it will be corrected, with no effect on the marking schedule

As an expert with many years of research on MC testing, Prof. LÂM QUANG THIỆP, former Director of the Higher Education Department, spoke with Tuổi Trẻ Online about how to improve the quality and ensure the accuracy of MC test papers, and about how an incident like this affects candidates' results.

* After MOET published the marking guide for the physics paper in the upper-secondary graduation exam, the answer to one MC question caused much debate. In your view, is that normal or abnormal in MC testing? Should an MC paper whose answer key has to be revised, and for which it is hard to determine which answer is really appropriate, be regarded as substandard?

- I have followed the debate on the physics paper in the press. Many physics teachers found that the question has not one but two or three correct options. The debate is very interesting and many of the opinions are reasonable. This is normal and can easily happen with any MC question, because however attentive the writer and the reviewers of a question are, they can hardly see it from every angle and so can easily slip. Only test takers of different ability levels, thinking about the question in many different ways, can uncover all of its aspects.

Debating MC questions in this way is very good if it happens within a classroom, because it helps students understand the material more deeply. For a national exam paper, however, the presence of such questions is abnormal. And of course a paper containing such questions must be regarded as substandard, which is why the Department of Testing and Education Quality Accreditation had to go to the trouble of working out solutions and adjusting the marking. A national paper taken by hundreds of thousands of candidates must be highly standardized so that marking is as fair as possible, and such problematic questions should have been eliminated beforehand and kept out of the official paper.

* So how can such a paper affect the exam results, Professor?

- An MC test is a measuring stick, used to measure candidates' ability. Every measurement has error, but for important measurements such as the upper-secondary graduation exam or the university entrance exam we must try to make the measuring stick as accurate as possible, because the result of that measurement decides the fate of many people. A paper containing such problematic questions is an inaccurate measuring stick, and obviously the measurement suffers.

For this physics paper I cannot yet say how much it affects the accuracy of the measurement, because only the disputed question has clearly shown its weakness, with many people noticing that its solution is not unique, while the quality of the other questions on the paper is not yet known. Only after each question has been analysed statistically, using software built on testing science, can the quality of each question and of the paper as a whole be assessed accurately.

I have analysed all of last year's university entrance papers. Some of them, for example the English paper, contain many poor-quality questions, although there was no debate in the press because the flaws were not very obvious. I very much hope to have the chance to discuss this with the Department of Testing and Education Quality Accreditation. A paper with many poor-quality questions can be likened to a measuring tape made of rubber: the results will be very inaccurate. I hope the recent physics paper is not that bad.

To achieve high-quality MC papers, advanced countries all follow a rigorous test development procedure. Vietnam is fully capable of building high-quality standardized MC papers, because we have many experts who understand this procedure, and some institutions have already built analysis software based on modern test theory.

The reason the physics paper still contained such an obviously poor question is probably that the Department of Testing did not follow the proper procedure for building national papers using modern testing technology. The most important step in that procedure is ensuring that every question, before being included in an official paper, is trialled on a certain number of people similar to the candidates, in this case grade 12 students.

* In your view, how can we make sure that papers no longer contain poor-quality questions? One of MOET's principles for exam papers is not to set questions on topics that are still contested, that are too difficult, or that are designed to trick candidates. Can MC papers comply with these rules?

- Earlier, when commenting on the Department of Testing and Education Quality Accreditation's consultation paper on whether MC methods should be the main method for national exams, I replied that I do not support MC testing if the papers are not developed following the proper procedure, and that I support it if they are. It is precisely current testing technology that makes possible a procedure for building high-quality MC papers, eliminating all or most of the poor-quality questions.

An important advantage of the MC method is that a long process involving many people can be used to write, trial, revise and select good questions, producing a high-quality paper while still keeping the paper secret. The trial results, analysed with modern testing software, show the quality of each question, so that it can be revised or dropped altogether from the official paper.
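
As a rough illustration of what a per-question analysis of trial data can report, the sketch below computes two classical item statistics: the facility value (the proportion of test takers answering correctly) and a simple upper-lower discrimination index. The interview does not say which statistics or which software are meant, so the function name `item_stats`, the 27% group fraction and the toy response matrix `trial` are all assumptions made for illustration.

```python
def item_stats(responses, group_fraction=0.27):
    """Return (facility, discrimination) for each item.

    responses: one list per student, holding 0/1 scores for each item.
    group_fraction: share of students in the top and bottom groups
    (0.27 is a common convention, assumed here).
    """
    n_students = len(responses)
    n_items = len(responses[0])
    # Rank students by total score, best first.
    ranked = sorted(responses, key=sum, reverse=True)
    g = max(1, round(n_students * group_fraction))
    upper, lower = ranked[:g], ranked[-g:]
    stats = []
    for i in range(n_items):
        facility = sum(r[i] for r in responses) / n_students
        # Share correct in the top group minus share correct in the bottom group.
        discrimination = (sum(r[i] for r in upper) - sum(r[i] for r in lower)) / g
        stats.append((facility, discrimination))
    return stats


# Hypothetical trial data: 6 students, 3 items (1 = correct, 0 = wrong).
trial = [
    [1, 1, 0],
    [1, 1, 1],
    [1, 0, 0],
    [0, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
]
stats = item_stats(trial)
for number, (facility, discrimination) in enumerate(stats, start=1):
    print(f"item {number}: facility={facility:.2f}, discrimination={discrimination:+.2f}")
```

In this toy data the third item has a negative discrimination index: weaker students answer it correctly more often than stronger ones. That is the kind of signal that would prompt reviewers to check the item for an ambiguous stem or a wrong key before it reaches an official paper.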

* But Professor, the papers we are talking about here are those of extremely important exams such as upper-secondary graduation and university and college entrance, which are treated as national secret documents. Is it possible to trial questions in advance and still keep a national exam paper secret?

- The beauty of the MC method is precisely that every question is trialled before it goes into the official paper, yet the paper remains secret. If papers for large-scale exams are developed following the necessary procedure, leaks will almost never occur.

Indeed, with the essay method, keeping the paper secret requires limiting as far as possible the number of people involved in setting it, isolating them while they work, and doing the work in a very short time, which easily leads to slips in the paper. By contrast, although the MC method allows a paper to be developed over a long period with contributions from many people, the paper is still not leaked, because each person is involved with only a very small number of questions and is not allowed to keep any of them.

For example, a number of grade 12 classes are chosen as a representative sample to trial small quizzes of 10 or 15 MC questions during an ordinary class test. The students must follow strict rules forbidding copying of the questions, and all quiz papers are collected afterwards. By running trials with many such small quizzes at many schools, all the questions in the item bank get trialled. Such a trial procedure ensures that the questions are not leaked.

Suppose a student taking part in the trial does remember a few questions. The probability that those questions end up in the official paper is still very small, because only at the last moment before the exam is a paper of fifty to seventy questions assembled automatically and printed from a bank of thousands of questions. It is also thanks to this procedure that, even while our country does not yet have many skilled item writers, high-quality questions can still be selected to make a good paper for the official exam.
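
The size of that probability can be sketched with a simple hypergeometric calculation. The figures used below (a bank of 3,000 items, a 60-item paper, 3 remembered items) are hypothetical values chosen to match "thousands" and "fifty to seventy", and the sketch assumes the paper is drawn uniformly at random from the bank, which a real assembly procedure balancing topics and difficulty would not do exactly.

```python
from math import comb


def p_at_least_one(bank_size, paper_size, remembered):
    """Probability that at least one remembered item appears on the paper,
    assuming the paper is a uniform random sample (without replacement)
    from the item bank."""
    p_none = comb(bank_size - remembered, paper_size) / comb(bank_size, paper_size)
    return 1 - p_none


def expected_overlap(bank_size, paper_size, remembered):
    """Expected number of remembered items that appear on the paper."""
    return remembered * paper_size / bank_size


# Hypothetical figures, not taken from the interview.
bank, paper, known = 3000, 60, 3
print(f"P(at least one remembered item on the paper) = {p_at_least_one(bank, paper, known):.3f}")
print(f"Expected number of remembered items on the paper = {expected_overlap(bank, paper, known):.2f}")
```

Under these assumptions the chance of seeing even one remembered question is a few percent, and it shrinks further as the bank grows relative to the paper.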

I very much hope that those directly managing the development of MC papers for the upcoming national exams will follow the necessary procedure. If they do not, a major advantage of the MC method will be wasted, the papers will certainly be of poor quality, and cases like the recent physics paper will happen again!

Interview by THANH HÀ

Friday, June 27, 2008

Adopt, adapt, or develop?

One common question faced by teachers and school administrators is whether to use a ready-made test, to make some small changes to it, or to build a completely new test. In order to make the right decision, we need to consider practical as well as theoretical issues. Chapter 2 in Brown will provide you with information on what to consider when you make such decisions.

Before you read, think about your own situation. Do you always use ready-made tests for important testing events, and if so, what kind? Or do you sometimes adapt existing tests or develop new ones? How do you decide what to change? What criteria do you use in developing new tests?

Phuong Anh

Friday, June 20, 2008

Language Testing Course Syllabus

Language Testing

Lecturers:
- Vu Thi Phuong Anh (PhD), email: anhvukim@yahoo.com or vtpanh@gmail.com
- Nguyen Bich Hanh (MA), email: bichhanh56@yahoo.com


Course objectives
- to provide students with an understanding of the basic distinction between two families of tests, i.e. norm-referenced and criterion-referenced tests, and their respective characteristics
- to equip language program administrators and classroom teachers with practical and useful tools for making correct decisions in their jobs such as admission, placement, certification, making comparisons between students, judging the effectiveness of teaching techniques, materials, curriculum, or assessing learners' growth.

Required text
Brown, J. D. (2005) Testing in Language Programs: A Comprehensive Guide to English Language Assessment (2nd ed). Singapore, McGraw-Hill.

Recommended text
Hughes, A. (2003) Testing for Language Teachers (2nd ed). Cambridge, CUP.

Week 1 - Types and Uses of Language Tests
Week 2 - Adopting, Adapting and Developing Language Tests
Week 3 - Developing Good Quality Language Test Items
Week 4 - Item Analysis in Language Testing
Week 5 - Describing Language Test Results
Week 6 - Interpreting Language Test Scores
Week 7 - Correlation in Language Testing
Week 8 - Language Test Reliability
Week 9 - Language Test Dependability
Week 10 - Language Test Validity
Week 11 - Language Testing in Reality

Monday, June 16, 2008

ADPRIMA - on the Internet since 1997
always striving for the best education information


Measurement, Assessment, and Evaluation in Education
Dr. Bob Kizlik

Updated May 29, 2008


In my years of teaching undergraduate courses, I was continuously reminded each semester that many education students who had the requisite course in "educational tests and measurements" or a course with a similar title as part of their professional preparation, had confusing ideas about fundamental differences in terms such as measurement, assessment and evaluation as they are used in education. When I asked the question, "what is the difference between assessment and evaluation," I usually got a lot of blank stares. Yet, it seems that understanding the differences between measurement, assessment, and evaluation is fundamental to the knowledge base of teaching, and certainly to the processes employed in the education of future teachers.

In many places on the ADPRIMA website the phrase, "Anything not understood in more than one way is not understood at all" appears after some explanation or body of information. That phrase is, in my opinion, a fundamental idea of what should be a cornerstone of all teacher education. Students often struggle with describing or explaining what it means to "understand" something that they say they understand. I believe that, in courses in educational tests and measurements, "understanding" has often been inferred from responses on multiple-choice tests or from solving statistical problems. A semester later, when questioned about very fundamental ideas in statistics, measurement, assessment and evaluation, my former students had seemingly forgotten most, if not all, of what they "learned."

Measurement, assessment, and evaluation mean very different things, and yet most of my students are unable to adequately explain the differences. So, in keeping with the ADPRIMA approach to explaining things in as straightforward and meaningful a way as possible, here are what I think are useful descriptions of these three fundamental terms. These are personal opinions, but they have worked for me for many years. They have operational utility.

Measurement refers to the process by which the attributes or dimensions of some physical object are determined. One exception seems to be in the use of the word measure in determining the IQ of a person. The phrase, "this test measures IQ" is commonly used. Measuring such things as attitudes or preferences also applies. However, when we measure, we generally use some standard instrument to determine how big, tall, heavy, voluminous, hot, cold, fast, or straight something actually is. Standard instruments refer to instruments such as rulers, scales, thermometers, pressure gauges, etc. We measure to obtain information about what is. Such information may or may not be useful, depending on the accuracy of the instruments we use, and our skill at using them. There are few such instruments in the social sciences that approach the validity and reliability of, say, a 12" ruler. We measure how big a classroom is in terms of square feet, we measure the temperature of the room by using a thermometer, and we use multimeters to determine the voltage, amperage, and resistance in a circuit. In all of these examples, we are not assessing anything; we are simply collecting information relative to some established rule or standard. Assessment is therefore quite different from measurement, and has uses that suggest very different purposes. When used in a learning objective, the definition provided on the ADPRIMA site for the behavioral verb measure is: To apply a standard scale or measuring device to an object, series of objects, events, or conditions, according to practices accepted by those who are skilled in the use of the device or scale.

Assessment is a process by which information is obtained relative to some known objective or goal. Assessment is a broad term that includes testing. A test is a special form of assessment. Tests are assessments made under contrived circumstances especially so that they may be administered. In other words, all tests are assessments, but not all assessments are tests. We test at the end of a lesson or unit. We assess progress at the end of a school year through testing, and we assess verbal and quantitative skills through such instruments as the SAT and GRE. Whether implicit or explicit, assessment is most usefully connected to some goal or objective for which the assessment is designed. A test or assessment yields information relative to an objective or goal. In that sense, we test or assess to determine whether or not an objective or goal has been obtained. Assessment of skill attainment is rather straightforward. Either the skill exists at some acceptable level or it doesn’t. Skills are readily demonstrable. Assessment of understanding is much more difficult and complex. Skills can be practiced; understandings cannot. We can assess a person’s knowledge in a variety of ways, but there is always a leap, an inference that we make about what a person does in relation to what it signifies about what he knows. In the section on this site on behavioral verbs, to assess means To stipulate the conditions by which the behavior specified in an objective may be ascertained. Such stipulations are usually in the form of written descriptions.

Evaluation is perhaps the most complex and least understood of the terms. Inherent in the idea of evaluation is "value." When we evaluate, what we are doing is engaging in some process that is designed to provide information that will help us make a judgment about a given situation. Generally, any evaluation process requires information about the situation in question. A situation is an umbrella term that takes into account such ideas as objectives, goals, standards, procedures, and so on. When we evaluate, we are saying that the process will yield information regarding the worthiness, appropriateness, goodness, validity, legality, etc., of something for which a reliable measurement or assessment has been made. For example, I often tell my students that if they wanted to determine the temperature of the classroom, they would need to get a thermometer, take several readings at different spots, and perhaps average the readings. That is simple measuring. The average temperature tells us nothing about whether or not it is appropriate for learning. In order to find that out, students would have to be polled in some reliable and valid way. That polling process is what evaluation is all about. A classroom average temperature of 75 degrees is simply information. It is the context of the temperature for a particular purpose that provides the criteria for evaluation. A temperature of 75 degrees may not be very good for some students, while for others, it is ideal for learning. We evaluate every day. Teachers, in particular, are constantly evaluating students, and such evaluations are usually done in the context of comparisons between what was intended (learning, progress, behavior) and what was obtained. When used in a learning objective, the definition provided on the ADPRIMA site for the behavioral verb evaluate is: To classify objects, situations, people, conditions, etc., according to defined criteria of quality. Indication of quality must be given in the defined criteria of each class category. Evaluation differs from general classification only in this respect.

To sum up, we measure distance, we assess learning, and we evaluate results in terms of some set of criteria. These three terms are certainly connected, but it is useful to think of them as separate but connected ideas and processes.

Here is a great link that offers different ideas about these three terms, with well-written explanations. Unfortunately, most information on the Internet concerning this topic amounts to little more than advertisements for services.

ASSESSMENT, MEASUREMENT, EVALUATION & RESEARCH

Another resource with a wide variety of information on many related topics is Development Gateway.

Distance Education Aptitude and Readiness Scale (DEARS)

"Anything not understood in more than one way is not understood at all."

Bob Kizlik

ADPRIMA - bridging the gap between theory and practice

Experts in Language Assessment

Computer-based testing

Cambridge ESOL delivers exams to more than 2,000 centres in over 130 countries. We are constantly seeking ways to enhance our service to our exam centres and the 2 million candidates who take our tests.

Computer-based testing (CBT) offers additional flexibility for candidates and centres: choice of the mode of the exam, more frequent sessions, reduced lead-in and turnaround times and enhanced security.

Now there are two ways to offer Cambridge exams

Computers are now an established part of our daily lives. Their use in education has become widespread and many students around the world have excellent keyboard skills and gain computer knowledge from an early age.

The CBT versions follow the same test format as the paper-based tests. The speaking part of the test is completed in exactly the same way as in the paper-based version — through face-to-face assessment.

Centres do not have to offer the computer-based versions. We offer paper-based tests alongside the new CBT.

Same format, same questions, same certificate

The CBT includes the same question types which are marked on the same criteria. The certificate awarded and its value are identical to the paper-based test.

Safe, efficient method of language assessment

Cambridge ESOL's computer-based tests have been carefully designed and extensively trialled to ensure the same high standards of testing as the paper-based tests. They provide a safe, secure and efficient way of assessing candidates' English language skills and knowledge.

For information on the various CBT trials, see Cambridge ESOL's Research Notes.

Technology advances

Recent advances in technology and online security make CBT accessible in an increasing number of locations. Its rise in popularity as another way to take tests has been rapid.

The software system we use runs on a standard computer network. A number of candidate workstations are linked to an administrator workstation. The test materials are downloaded directly by the centre using a highly secure internet-based system. Once the candidates have completed the exams, the exam responses are uploaded directly through the Cambridge Connect system to Cambridge ESOL for assessment. This takes away the need for secure postal returns.

There is a quick online tutorial for candidates to use before they begin their exams.

Easy, standard web functions

CBT uses standard web navigation that is clear and easy to follow. The quick and helpful tutorial session at the start of the exam ensures candidates can spend a few moments making sure they know how the test works and what they need to do.

Making exams easy to run

There are no specialist requirements. Exam centres using the Cambridge Connect software system that runs the computer-based tests do not need to have a dedicated computer-testing suite.

The system can be installed and operated on a standard IT infrastructure with internet connection at sites such as universities, colleges or schools. Cambridge Connect can also be installed on laptops on a wireless network.

Centre staff are trained and fully supported in the use of Cambridge Connect.

Because the CBT exams are paperless, so is the administration.

Benefits

  • Additional exam sessions
    The flexibility of CBT means that more exam sessions can be offered to candidates. Centres can respond to demand and additional exams can be run at different times of the year to the paper-based tests. This will suit candidates who are ready to take the exam at different times of the year.
  • Faster turnaround of results
    Exam results are delivered quickly and securely to candidates who sign up to use the Results Online service. They are available to both the centre and candidates within three to four weeks of the exam taking place. With CBT, candidates can enter for an exam, take the exam and receive their results within just five weeks.
  • Making exams more accessible
    Our research shows that many candidates prefer taking the Writing and Listening parts on a computer, especially younger candidates who are used to working with keyboards and headphones and who are comfortable taking exams using technology that is familiar to them.
  • No special preparation is needed
    Teachers do not need to do any special preparation with their students and they do not need any special technical knowledge. Exam preparation remains the same as there is no change to the curriculum or the teaching approach.

More information on our exams and how to find CBT dates.

Pretesting

What are pretests?

As part of Cambridge ESOL's commitment to accuracy and quality, we submit all the materials in our exams to a number of procedures to ensure they are accurate and reliable. One of these is pretesting.

Testing materials before they are used in exams allows us to make certain our exams are accurate and fair.

How can they help teachers and students?

Pretests give students a chance to practise taking a Cambridge ESOL exam using genuine questions under exam conditions.

After taking the Reading, Listening and Use of English papers, the pretesting students are given scores. Writing papers are marked by genuine Cambridge ESOL examiners and candidates receive information about how they performed in the writing test. (Note: pretests are not available for Speaking papers.)

This helps students to know which areas they need most practice in, and gives them experience and confidence in taking tests. For teachers, it helps highlight areas where their students might need more help.

How do they help Cambridge ESOL?

Pretests are an essential part of the exam production process. Statistical data is obtained for each task, which allows us to construct our exams to a prescribed level of difficulty. This ensures, for example, that an FCE Reading paper produced for June 2008 is at the same level of difficulty as the same exam produced in December 2008.

When can pretests be taken?

Exams with fixed dates (such as PET, KET and FCE) usually have a pretest window of about three weeks. Exams which are 'on demand' (such as IELTS) usually have an open pretest date. The Pretesting Calendar (PDF 44Kb) gives the pretest dates available for this year as well as information about the length of the pretest.

It is best if students take pretests about six to eight weeks before their real test. This means that they are nearly ready for the exam, and so are at the right level. It also means that teachers will get the scores back in time to focus on any particular language areas in need of practice.

What else should I know?

  1. All test papers and DHL despatch costs are paid by Cambridge ESOL.
  2. Pretest papers are marked in Cambridge, and scores are returned to schools within three weeks. (Writing papers are sent to examiners, so these scores and reports may take a little longer.)
  3. Students take the pretests under exam conditions: pretest centres simulate the real exam, so that students not only experience the kind of questions they will face in their live exam, but also complete the answer sheets in a 'test-like' environment.
  4. After the pretest, all materials must be sent back to Cambridge ESOL. Materials cannot be kept for classroom practice.

How can we become involved?

To take part in pretesting, your school must either be a Cambridge ESOL test centre or be a 'Pretest Approved Institution'. If your school is not yet approved, your exam administrator needs to complete the Pretesting Institution Pre-Approval Form (PDF 472Kb), and this needs to be signed by the Local Secretary of the centre at which your students will take the real exam. Once the form is returned to Cambridge ESOL and has been checked and approved, your school may then be invited to participate in pretesting one or more of the Cambridge ESOL exams that your students sit.

How can I find out more?

You can find more information about the purpose and practice of pretesting from the Pretesting Guide (PDF 40Kb).

For more information on 'pretesting windows' as well as information about the length of pretests, see the Pretesting Calendar (44Kb).

If you have any further questions about participation in Cambridge ESOL pretesting, please contact: pretesting@CambridgeESOL.org

Please note: pretests are not available for YLE or for Speaking papers.

Measuring readiness for simplified material: a test of the first 1,000 words of English. In Simplification: Theory and Application, ed. M. L. Tickoo, RELC Anthology Series No. 31, 1993, pp. 193-203.


Measuring Readiness for Simplified Material: A Test of the First 1,000 Words of English


Paul Nation

Victoria University of Wellington



The first 1,000 words of English are the essential basis for simplified teaching material. This article describes the need for a test of these words and the difficulties in making one. It contains two equivalent forms of a test along with instructions on how to use it and how to apply the information gained from it.


The importance of high frequency words


Frequency studies of English have shown that the return for learning the high frequency words is very great. Generally these high frequency words are considered to be the most frequent 2,000 words (West, 1953), although some research indicates that the return for learning vocabulary drops off rather quickly after the first 1,500 words (Engels, 1968; Hwang, 1989). The return for learning is the coverage of text, spoken or written, that knowledge of the words provides. For example, Schonell et al (1956) found that the most frequent 1,000 words in spoken English provided coverage of 94% of the running words in informal conversation. Similarly, figures from the frequency count by Carroll et al (1971) indicate that the first 1,000 words of English cover 74% of written text. Note that coverage refers to running words, where each recurrence of a word is counted as additional coverage. Thus, knowing just the word "the" gives several percent coverage of written text (though well under 10%), because this word occurs so frequently. Clearly the return for learning the first 1,000 words of English is very high. By comparison, the second most frequent 1,000 words of English provide coverage of only 7% of written text.
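
The notion of coverage of running words can be made concrete with a short sketch. It is only an illustration: the function name `coverage`, the crude regular-expression tokenizer and the sample sentence are assumptions, and a real count would use a proper word list such as the General Service List and handle inflected forms.

```python
import re


def coverage(text, known_words):
    """Proportion of running words (tokens) in `text` that are in `known_words`.

    Every recurrence of a word counts, so very frequent words
    contribute heavily to the total.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    known = {w.lower() for w in known_words}
    covered = sum(1 for token in tokens if token in known)
    return covered / len(tokens)


# Hypothetical sample text and word lists.
sample = "The cat sat on the mat."
print(coverage(sample, ["the"]))                # 2 of the 6 running words
print(coverage(sample, ["the", "on", "cat"]))   # 4 of the 6 running words
```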


It should not be thought that the first 1,000 words are made up mainly of words like "the", "and", "of", "they", and "because". These function words make up fewer than 150 of the 1,000 words.


Lists containing the first 1,000 words


There are several lists available of the most frequent words of English. These include frequency counts (Carroll et al, 1971; Francis & Kucera, 1982; Thorndike & Lorge, 1944), and combinations of various lists (Hindmarsh, 1980; Barnard & Brown, in Nation 1984). The list chosen for this test is West's General Service List of English Words (1953). The General Service List has been used as a basis for many series of graded readers, and this provides an advantage in using it for the test. This list is rather old, based on work done in the 1930s and 1940s. However it still remains the most useful one available as the relative frequency of various meanings of each word is given. When making the tests included here the words chosen were checked against the Carroll et al count to make sure that they occurred in the first 2,000 words of that count.


Difficulties in testing the first 1,000 words


There are several difficulties involved with making a test of the first 1,000 words. The first is that such a test may be used with classes of learners who speak different first languages, and thus translation is not a practical approach. Second, there is the likelihood that some learners will have poor reading skills, and thus the test needs to be able to be given orally if necessary. These two factors resulted in the choice of a true/false format. Multiple-choice was not possible because it is impractical in an oral form. One disadvantage of true/false is the possible strong effect of guessing, although research by Ebel (1979) indicates that this is not as likely as it seems. In an attempt to overcome possible effects of guessing, three types of responses were suggested in the instructions (True, Not true, Do not understand), and each word was tested twice, once in each version of the test. Where a word is tested twice, there are four possible sets of answers, namely both correct, both wrong, the first item correct and the second wrong, or the first item wrong and the second correct. There is thus only a one in four chance of correctly guessing both items testing the same word. This is the same chance as with a four-option multiple-choice item. So if the teacher feels that learners are making wild guesses, both forms of the test should be given and a mark given only when both items testing a word are correct.
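
The one-in-four figure can be checked by enumerating the possible outcomes. The sketch below assumes a learner who guesses blindly and with equal probability between "True" and "Not true" on each item, independently, and who never chooses "Do not understand"; the function name is made up for illustration.

```python
from fractions import Fraction
from itertools import product


def p_all_correct_by_guessing(options_per_item, items_per_word):
    """Chance that blind, equally likely guesses are right on every item
    testing the same word."""
    # Label the correct option of each item as 0 and enumerate all answer patterns.
    outcomes = list(product(range(options_per_item), repeat=items_per_word))
    hits = sum(1 for outcome in outcomes if all(choice == 0 for choice in outcome))
    return Fraction(hits, len(outcomes))


# Two true/false items testing the same word.
print(p_all_correct_by_guessing(options_per_item=2, items_per_word=2))  # 1/4
# One four-option multiple-choice item.
print(p_all_correct_by_guessing(options_per_item=4, items_per_word=1))  # 1/4
```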


A third difficulty is that the contexts for the tested words must not cause too many problems for the learners. When making the test, an attempt was made to ensure that the context words were of higher frequency than the tested word. This was not always possible for some of the words and thus a few words have some context words of the same frequency. There are no items with contexts of lower frequency. Occasionally a picture was used to avoid a lower frequency word, for example:


This can keep people away from your house.

[Picture of a dog]

Dog is a lower frequency word than the test word keep and so a picture was used instead of saying:

A dog can keep people away from your house.

This frequency restriction on the context was the most difficult constraint to overcome when making the test.


A fourth difficulty is that most of the high frequency words have several meanings. In the test only the most frequent meaning was tested. This was found by referring to West (1953) and the COBUILD dictionary (Sinclair, 1987).


A fifth difficulty is that using true/false items where the judgement is based on general knowledge allows other factors besides vocabulary knowledge to play a part. Some items where this may occur include:

Some children call their mother Mama.

You can go by road from London to New York.

Each society has the same rules.

Some problems of this type were removed as a result of trialling the test. There is value, however, in having the words in context in that the context can help in accessing the meaning of the word as well as limiting the meaning that is being tested. The disadvantage of drawing on general knowledge is not as great as the advantage of testing in use rather than by definitions.


A sixth difficulty is the grammatical complexity of the context of the tested words. For example, several of the highest frequency items are tested in the two-clause pattern "When _____, _____". This was unavoidable. Trialling of the test helped find some items where this caused too much difficulty and these were changed.


Using the test


Usually one form of the test (40 items) should be enough to get a useful result. When the test is given orally, the learners will need to be able to see the accompanying pictures. It is probably best if the test is given orally to one learner at a time. The teacher can repeat the items to the learners as many times as is needed. If the teacher knows the learner's first language then also requiring a translation would be a useful check. It is possible to find which word is tested by comparing the two items in the two forms of the test as both forms contain the tested words in exactly the same order. The ordering is based on frequency of occurrence according to West (1953) with the most frequent word (time) occurring first.


Only content words (nouns, verbs, adjectives, adverbs) are tested. To find what percentage of the first 1,000 words is known, multiply the total score on one version of the test (40 items) by 2.5. Multiplying by 2.5 assumes that the learners already know the same proportion of function words.
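As a worked illustration of this arithmetic, the following Python sketch converts a raw score into a percentage and into an approximate word count. The function names are hypothetical, and the word-count step simply applies the percentage to a 1,000-word list under the same assumption about function words that the article states.

```python
def percent_known(score, items=40):
    """Convert a raw score to a percentage (score * 2.5 for a 40-item form)."""
    if not 0 <= score <= items:
        raise ValueError("score must lie between 0 and the number of items")
    return score * 100 / items


def estimated_words_known(score, items=40, list_size=1000):
    """Apply the same proportion to the whole 1,000-word list.

    Assumption (from the article): function words are known in the same
    proportion as the content words that were actually tested.
    """
    return round(percent_known(score, items) / 100 * list_size)


pct = percent_known(30)             # 30 * 2.5 = 75.0
words = estimated_words_known(30)   # about 750 of the first 1,000 words
```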


Applying the results


The results of the test can be used to help diagnose areas of weakness, set learning goals and plan a vocabulary programme, measure vocabulary growth, and assign graded reading. Let us look at each of these in turn.


Diagnosis: The test can be used to help answer this question: is the learner's poor performance in reading or listening a result of inadequate vocabulary knowledge? Some learners, particularly those for whom English is a foreign language, have difficulty understanding spoken English. This could be because they do not know enough vocabulary or simply because they have learned English through reading and have not had enough contact with spoken English. Giving the vocabulary test in its written form should help the teacher see where the problem lies. With such learners it would be interesting to give one form of the test orally and one form through reading to see what the difference was.


Similarly, learners who have had a lot of contact with spoken English may be poor at reading, and doing the test orally should reveal their vocabulary knowledge.


Set learning goals: The first 1,000 words of English are essential for all learners who wish to use the language. It is thus very important that teachers know what vocabulary knowledge their learners have and are aware of how they can systematically help them to increase this knowledge. If learners do not know all of the first 1,000 words of English it is well worth ensuring that they have the opportunity to learn those that they do not know. Nation (1990) looks at this in detail over the four skills of listening, speaking, reading and writing. Ways of doing this include substantial graded reading, direct vocabulary teaching, doing vocabulary learning exercises, and systematically providing a vocabulary focus in language learning activities. If learners' vocabulary is larger than 1,000 words, the Vocabulary Levels Test (Nation, 1990) can be used.


Measure vocabulary growth: The two equivalent forms of the 1,000 word test allow the teacher to check how much learners' vocabulary has increased over several months. This use should be treated with caution as each test has only forty items and thus the confidence interval would be large if we were measuring an individual's increase in vocabulary size. When both forms of the test were administered to the same group of learners, it was found that the most difficult items in test A tested words that were also in the most difficult items in test B. These words were ancient, stream, remain, wide, and at least. It was also found that two-thirds of the learners gained scores on tests A and B that were within two marks or less of each other. Only one of the fifteen learners tested had scores which were more than four marks different.
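To give a rough sense of how wide the uncertainty around a 40-item score can be, here is a minimal Python sketch. The article does not say how a confidence interval should be computed; the Wilson score interval used here is one common choice and is an assumption of this sketch, as is treating the 40 items as independent trials. The name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(correct, n_items=40, z=1.96):
    """Approximate 95% Wilson score interval for the proportion known.

    Assumptions: items behave like independent trials, and z = 1.96
    corresponds to a 95% confidence level.
    """
    if not 0 <= correct <= n_items:
        raise ValueError("correct must lie between 0 and n_items")
    p = correct / n_items
    denom = 1 + z ** 2 / n_items
    centre = (p + z ** 2 / (2 * n_items)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / n_items + z ** 2 / (4 * n_items ** 2)) / denom
    )
    return centre - half_width, centre + half_width


# A learner scoring 20 out of 40: the interval spans roughly 35% to 65%.
low, high = wilson_interval(20)
```

An interval this wide for a single learner is consistent with the caution above: small changes in one individual's score between the two forms should not be read as vocabulary growth.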


Assign graded reading: Various series of graded readers have several stages of readers within the first 1,000 words of English. Longman Structural Readers, for example, have books written at the 300-word stage, the 500-word stage, the 750-word stage, and the 1,100-word stage. The way that the Longman series divides the words into stages does not correspond exactly to frequency (and thus to the ordering of items in the vocabulary test) but there is rough agreement. For example, the first 10 items in the vocabulary test are made up of one test word from Longman Stage 1, six from Stage 2, and three from Stage 3. Because the agreement is rough, it is better to use learners' total scores on the test to decide what stage of graded reader they should be reading. If their vocabulary score on a 40-item test is less than 10 they should be reading at Stage 1, from 11 to 20 Stage 2, from 21 to 30 Stage 3, and above 30 Stage 4. Graded reading is an excellent way of increasing vocabulary. By reading three or more readers at one stage learners are likely to meet all of the vocabulary at that stage. Having mastered the vocabulary of that stage, they can go to the next stage without needing extra preparation for the new vocabulary (Wodinsky & Nation, 1988).
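The score bands above can be written as a small lookup, sketched here in Python. The function name `longman_stage` is hypothetical. Note one assumption: the bands as stated ("less than 10", "from 11 to 20") leave a score of exactly 10 unassigned, and this sketch places it in Stage 1.

```python
def longman_stage(score):
    """Map a 40-item test score to a suggested Longman Structural Readers stage.

    Bands follow the article: below 10 -> Stage 1, 11-20 -> Stage 2,
    21-30 -> Stage 3, above 30 -> Stage 4.
    Assumption: a score of exactly 10, which the article does not assign,
    is treated as Stage 1.
    """
    if not 0 <= score <= 40:
        raise ValueError("score must lie between 0 and 40")
    if score <= 10:
        return 1
    if score <= 20:
        return 2
    if score <= 30:
        return 3
    return 4


stages = [longman_stage(s) for s in (5, 15, 25, 35)]  # [1, 2, 3, 4]
```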


The content of test items


The items in a test which is not based on a particular piece of content knowledge inevitably reveal the personality of the test maker. Looking back over the items I see that some reflect my philosophical attitudes: "We can be sure that one day we will die" (some learners seemed convinced that this was not true). This same sense of inevitability is revealed in "Day follows night and night follows day" and "Your child will be a girl or a boy". I also see my jaundiced attitudes to children after having raised a family: "It is easy for children to remain still" (clearly not true), "Most children go to school at night" (perhaps that should be true), "A child has a lot of power" (true or not true? Unfortunately omitted). In the earlier versions of the test there was also a strong moral tone: "It is good to keep a promise", "It is not good to try hard", "You must look to find the way". However, although the learners did not seem to have trouble with these items, colleagues convinced me that these were culture-bound and not in keeping with the tone of the last part of the twentieth century. I reluctantly changed some of them. It is after all easier to change test items than it is to change colleagues. After all, "A society is made of people living together".


References


Carroll, J.B., P. Davies and B. Richman 1971. The American Heritage Word Frequency Book. New York: American Heritage Publishing Co.


Ebel, R. L. 1979. Essentials of Educational Measurement. (3rd ed.) Englewood Cliffs: Prentice Hall.


Engels, L. K. 1968. The fallacy of word counts. IRAL 6,3: 213-231.


Francis, W. Nelson and Kucera, Henry 1982. Frequency Analysis of English Usage. Boston: Houghton Mifflin Company.


Hindmarsh, R. 1980. Cambridge English Lexicon. Cambridge: Cambridge University Press.


Hwang Kyong Ho 1989. Reading newspapers for the improvement of vocabulary and reading skills. Unpublished M.A. thesis, Victoria University of Wellington.


Nation, I.S.P. 1984. Vocabulary Lists. English Language Institute Occasional Publication No.12, Victoria University of Wellington.


Nation, I.S.P. 1990. Teaching and Learning Vocabulary. New York: Newbury House.


Schonell, F. J., I.G. Meddleton, and B.A. Shaw 1956. A study of the oral vocabulary of adults. Brisbane: University of Queensland Press.


Sinclair, J. (ed.) 1987. Collins Cobuild English Language Dictionary. London: Collins.


Thorndike, E. L. and I. Lorge 1944. The Teacher's Word Book of 30,000 Words. Teachers College, Columbia University.


West, Michael 1953. A General Service List of English Words. London: Longman, Green & Co.


Wodinsky, M. and I.S.P. Nation. 1988. Learning from graded readers. Reading in a Foreign Language 5,1: 155-161.


Sunday, 15 June 2008

Language Testing Blog: Welcome message



Shiken: JALT Testing & Evaluation SIG Newsletter
Vol. 4 No. 1 Spring 2000 (p. 2 - 3) [ISSN 1881-5537]

Estimating vocabulary size

by David Beglar


What's the best way to estimate an EFL learner's vocabulary size? Are there any effective tests to estimate how many words EFL learners know? Are there any problems with standard vocabulary-estimating tests we should be aware of? This article explores these questions.

One institution in Japan which has made a rigorous attempt to measure the vocabulary of their students is Temple University Japan's Corporate Education Program (CEP). This program has been using a version of Paul Nation's Vocabulary Levels Tests (Nation, 1990). These tests are relatively straightforward, and are made up of sets of six words and three definitions, as in the following example:
    a.  royal         1.  _____  first
    b.  slow          2.  _____  not public
    c.  original      3.  _____  all added together
    d.  sorry
    e.  total
    f.  private

This test is designed to estimate examinees' basic knowledge of common word meanings, and, specifically, the extent to which they know the common meanings of words at the 2,000, 3,000, 5,000, 10,000 and university word levels. The test can be classified as a sensitive vocabulary test, which means that the format is sensitive to partial word knowledge. A less sensitive test (e.g., a multiple-choice cloze test focused on specific content words only) would result in lower scores even if the same words were tested.

Originally, the test had 90 items, but after being trialled with Japanese learners, the best-performing 60 items were chosen. It is fast and can easily be administered in twenty minutes. It is also reliable (Cronbach's alpha = .95 and Rasch reliability estimate = .97). In short, this test gives a general idea of the number of words an English speaker knows.

Paul Nation and Batia Laufer have both utilized versions of the Vocabulary Levels Tests to estimate vocabulary this way: if learner A scores 9 out of 12 (75%) on the 2,000 word level, s/he probably knows approximately 75% (1,500) of the first 2,000 words of English. If you continue to apply this logic to the results of the rest of the test (i.e., the 3,000, 5,000, University Word Level, and the 10,000 word level), you can arrive at an approximate estimate of vocabulary size.
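The extrapolation described above can be sketched in Python as follows. This is a hedged illustration, not the procedure used by Nation or Laufer: the function name `estimate_vocabulary_size` is hypothetical, and the band sizes in `BAND_SIZES` (how many words each level is taken to represent) are assumptions of this sketch. The University Word Level is left out of the defaults because the article does not say how many words it should stand for; a caller can supply a value through `band_sizes`.

```python
# Assumed number of words represented by each level (hypothetical values):
# the 2,000 level stands for the first 2,000 words, the 3,000 level for the
# next 1,000, the 5,000 level for the next 2,000, and so on.
BAND_SIZES = {"2000": 2000, "3000": 1000, "5000": 2000, "10000": 5000}


def estimate_vocabulary_size(scores, items_per_level=12, band_sizes=BAND_SIZES):
    """Estimate vocabulary size from per-level scores.

    scores maps a level name to the number of items answered correctly.
    items_per_level=12 follows the article's "9 out of 12" example.
    """
    total = 0.0
    for level, correct in scores.items():
        if not 0 <= correct <= items_per_level:
            raise ValueError(f"invalid score for level {level}")
        total += correct / items_per_level * band_sizes[level]
    return round(total)


# The article's example: 9 out of 12 at the 2,000 word level.
first_band = estimate_vocabulary_size({"2000": 9})             # 1500
two_bands = estimate_vocabulary_size({"2000": 9, "3000": 6})   # 1500 + 500
```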

Another way to measure vocabulary is to focus on words which are of greater importance and to test only those words. The advantage is that by focusing on a narrower range of words, you can test more items and presumably arrive at more accurate estimates of what learners know. Beglar and Hunt (1999) did that with several versions of the 2,000 word level and University Word List tests. They trialled original pools of 72 items with native speakers of Japanese, selected the best-performing 54 items for each test and made two 27-item parallel forms. You can find these forms in the appendix of Beglar and Hunt (1999), or e-mail to receive copies as attachments. In the same article, we also briefly discuss why the 2,000 and University Word List levels are important words for learners to know.


In addition to these tests, Paul Meara and several of his colleagues and students (e.g., Meara and Jones, 1987; Meara and Jones, 1990) have worked extensively with the Eurocentres Vocabulary Size Test. This is a checklist test in which the examinee checks the words he thinks he knows. I have seen published research which sometimes paints these tests in a very good and at other times in a very bad (e.g., low reliability) light. You can find more information on the tests at the Vocabulary Acquisition Research Group's homepage at www.swansea.ac.uk/cals/calsres.html. Several tests can be downloaded from the "freebies" section of this web page, such as the EVST (a basic vocabulary size test) and the LLEX 2.21 (a basic recognition vocabulary test). In addition, Paul Meara's students often have information about ongoing research posted on this webpage. In my experience, there is almost always an article posted concerning the Eurocentres Vocabulary Size tests. You can also obtain a large number of these tests through the ERIC document reproduction service (see the Meara, 1992 reference below).

Finally, if you are interested in reading more about vocabulary testing, you might wish to take a look at the new text by John Read (2000) which has just been published by Cambridge. I can also recommend a book manuscript on vocabulary acquisition and teaching by Paul Nation (1999) which can be ordered by e-mail. In addition to learning everything about teaching vocabulary that you ever wanted to know, there is a good chapter on vocabulary testing which brings up a number of interesting issues such as vocabulary tests which are sensitive to differing levels of word knowledge.

References

Beglar, D., & Hunt, A. (1999). Revising and validating the 2000 word level and university word level vocabulary tests. Language Testing, 16 (2), 131-162.

Meara, P., & Jones, G. (1987). Tests of vocabulary size in English as a foreign language. Polyglot, 8 (1), 1-40.

Meara, P., & Jones, G. (1990). Eurocentres Vocabulary Size Test (version E1.1/K10,MSDOS). Zurich: Eurocentres Learning Service.

Meara, P. (1992). EFL vocabulary tests. Wales University: Swansea Centre for Applied Language Studies. (ERIC Document Reproduction Service No. ED 362 046).

Nation, I. S. P. (1990). Teaching and learning vocabulary. New York: Newbury House.

Nation, I. S. P. (1999). Learning vocabulary in another language. English Language Institute Occasional Publication No. 19. Wellington, NZ: Victoria University of Wellington.

Read, J. (2000). Assessing vocabulary. Cambridge: Cambridge University Press.


www.jalt.org/test/beg_1.htm
Teaching Acronyms



A
  • ABEEB - Association of British ESOL Examining Boards
  • ABLS - Association of British Language Schools
  • ADLTM - Advanced Diploma in English Language Teaching Management (UCLES)
  • ACoT - Associate of the College of Teachers
  • ACE - Adult and Community Education
  • ACP (TESOL) - Associate of the College of Preceptors (now ACoT Tesol)
  • A.Cert.TESOL - Advanced Certificate in TESOL (College of Teachers / Preceptors)
  • ACTDEC - Accreditation Council for TESOL Distance Education Courses
  • AE - Adult Education
  • AEB - Associated Examining Board
  • AEI - Adult Education Institute
  • ALTE - Association of Language Testers in Europe
  • ALTO - Association of Language Travel Organisations
  • ARELS - Association of Recognised English Language Services (now 'English UK')
  • AQA - Assessment and Qualifications Alliance
B
  • BAAL - British Association for Applied Linguistics
  • BACC - British Accreditation Council for Further & Higher Education
  • BALEAP - British Association of Lecturers in English for Academic Purposes
  • BALT - British Association for Language Teaching
  • BASELT - British Association of State English Language Teaching (now 'English UK')
  • BATQI - British Association of TESOL Qualifying Institutions
  • BC - British Council
  • BEC - Business English Certificate (UCLES)
  • BIELT - British Institute of English Language Teaching
  • BTEC - Business and Technology Education Council
  • BULATS - Business Language Testing Service (UCLES)
C
  • CAE - Certificate in Advanced English (UCLES)
  • CALL - Computer Assisted Language Learning
  • CBEVE - Central Bureau for Educational Visits and Exchanges
  • CCSE - Certificates in Communicative Skills in English (UCLES)
  • CEELT - Cambridge Examination in English for Language Teachers (UCLES)
  • CEIBT - Certificate in English for International Business and Trade (UCLES)
  • CELTA - Certificate in English Language Teaching to Adults (UCLES)
  • CELTYL - Certificate in English Language Teaching to Young Learners (UCLES)
  • Cert(ES) TESOL - Certificate of Educational Studies in TESOL (ACTDEC)
  • CertTEfIC - Specialist Certificate in Teaching English for Industry and Commerce (Trinity College)
  • CertTEB - Certificate in Teaching English for Business (LCCIEB)
  • Cert (TM) TESOL - Certificate in the Theory and Methodology of TESOL (ACTDEC)
  • CfBT - Centre for British Teachers
  • CGLI - City and Guilds of London Institute
  • CILT - Centre for Information on Language Teaching and Research
  • CILTS - Cambridge Integrated Language Teaching Schemes (UCLES)
  • CNAA - Council for National Academic Awards
  • COTE - Certificate for Overseas Teachers of English (UCLES)
  • CPE - Certificate of Proficiency in English (UCLES)
  • CRELS - Combined Registered English Language Schools (New Zealand)
  • CTEFLA - Certificate in Teaching English as a Foreign Language to Adults (UCLES - replaced by CELTA)
D
  • DELTA - Diploma in English Language Teaching to Adults (UCLES)
  • DELTYL - Diploma in English Language Teaching to Young Learners (UCLES)
  • DES - Diploma in English Studies (UCLES)
  • DFEE - Department for Education and Employment
  • Dip.BPE - Diploma in Business and Professional English Language Teaching (UCLES)
  • Dip.CoT - Diploma of the College of Teachers
  • Dip.CP (TESOL) - Diploma of the College of Preceptors (now Dip.CoT Tesol)
  • Dip.TESAL - Diploma in the Teaching of English to Speakers of Asian Languages (Trinity College)
  • Dip.TIB - Diploma in Teaching English for Business (LCCIEB)
  • Dip (TM) TESOL - Diploma in the Theory and Methodology of TESOL (ACTDEC)
  • Dip (TM) TESP - Diploma in the Theory and Methodology of TESP (ACTDEC)
  • DOTE - Diploma for Overseas Teachers of English (UCLES - replaced by DELTA)
  • DTEFLA - Diploma in Teaching English as a Foreign Language to Adults (UCLES - replaced by DELTA)
E
  • EAP - English for Academic Purposes
  • EAQUALS - European Association for Quality Language Services
  • EAL - English as an Acquired Language
  • EAT - European Association of Teachers
  • EFB - English for Business (LCCIEB)
  • EFC - English for Commerce (LCCIEB)
  • EFL - English as a Foreign Language
  • EFTI - English for the Tourism Industry (LCCIEB)
  • ELICOS - English Language Intensive Courses to Overseas Students (Australia)
  • ELL - English Language Learner
  • ELSA - English Language Skills Assessment (LCCIEB)
  • ELT - English Language Teaching
  • EOP - English for Occupational Purposes
  • ESB - English Speaking Board
  • ESL - English as a Second Language
  • ESOL - English to Speakers of Other Languages
  • ESP - English for Specific Purposes
  • ESU - English Speaking Union
F
  • FCoT - Fellow of the College of Teachers (UK)
  • FCE - First Certificate in English (UCLES)
  • FE - Further Education
  • FIELS - Federation of Independent English Language Schools (New Zealand)
  • FIRST - Association of British language schools
  • FIYTO - Federation of International Youth Travel Organisations
  • FLIC - Foreign Languages for Industry and Commerce (LCCIEB)
  • FTBE - Foundation Certificate for Teachers of Business English (LCCIEB)
G
  • GNVQ - General National Vocational Qualification (UK)
H
  • HE - Higher Education
  • HNC - Higher National Certificate (BTEC)
  • HND - Higher National Diploma (BTEC)
  • HPA - Home Providers Association
I
  • IATEFL - International Association of Teachers of English as a Foreign Language
  • IB - International Baccalaureate
  • IDLTM - International Diploma in Language Teaching Management (UCLES)
  • IDPA - International Development Program of Australia
  • IELTDHE - Institute for English Language Teacher Development in Higher Education
  • IELTS - International English Language Testing System (UCLES-British Council-IDPA)
  • IoL - Institute of Linguists
J
  • JCLA - Joint Council of Language Associations
  • JET - Japan Exchange and Teaching Programme
  • JET - Junior English Tests (AQA exam)
K
  • KET - Key English Test (UCLES)
L
  • LABCI - (Association of) Latin American British Cultural Institutes
  • LAGB - Linguistics Association of Great Britain
  • LCCIEB - London Chamber of Commerce and Industry Examinations Board
M
  • MLA - Modern Language Association
N
  • NAAE - National Association of Advisors in English
  • NALA - National Association of Language Advisors
  • NATE - National Association for the Teaching of English
  • NATECLA - National Association for the Teaching of English and Community Languages to Adults
  • NATEFLI - National Association of TEFL in Ireland
  • NATESOL - National Association of Teachers of English for Speakers of Other Languages (United States)
  • NATFHE - National Association of Teachers in Further and Higher Education (UK)
  • NCILT - National Centre for Industrial Language Training
  • NCLE - National Congress on Languages in Education
  • NCML - National Council for Modern Languages in Higher and Further Education
  • NCVQ - National Council for Vocational Qualifications
  • ND - National Diploma (BTEC)
  • NEAB - Northern Examinations and Assessment Board
  • NELLE - Networking English Language Learning in Europe
  • NESB - Non-English Speaking Background
  • NIACE - National Institute of Adult Continuing Education
  • NVQ - National Vocational Qualification
O
  • ODL - Open and Distance Learning
  • OFSTED - Office for Standards in Education (UK)
  • OIBEC - Oxford International Business English Certificate (UODLE)
  • OL - Open Learning
  • OU - Open University
P
  • PBE - Practical Business English (LCCIEB)
  • PEI - Pitman Examinations Institute
  • PET - Preliminary English Test (UCLES)
  • PGCE - Post Graduate Certificate in Education
  • Pre-Cert (ES) TESOL - Preliminary Certificate of Educational Studies in TESOL (ACTDEC)
Q
  • QTS - Qualified Teacher Status
R
  • RELSA - Recognised English Language Schools Association (Ireland)
  • RSA - Royal Society of Arts
S
  • SALT - Scottish Association for Language Teaching
  • SATEFL - Scottish Association for the Teaching of English as a Foreign Language
  • SATESL - Scottish Association for the Teaching of English as a Second Language
  • SCOTTESOL - Scottish TESOL Association
  • SEFIC - Spoken English for Industry and Commerce (LCCIEB)
  • SESOL - Spoken English for Speakers of Other Languages (Trinity College)
  • SET - Senior English Tests (AQA Exam)
  • SIETAR - Society for Intercultural Education, Training and Research
  • SNVQ - Scottish National Vocational Qualification
T
  • TEAL - Teaching English as an Additional Language
  • TEC - Training and Enterprise Council
  • TEEP - Test of English for Educational Purposes
  • TEFL - Teaching English as a Foreign Language
  • TEIL - Teaching English as an International Language
  • TESP - Teaching English for Specific Purposes
  • TEP - Tourism English Proficiency (UODLE)
  • TESL - Teaching English as a Second Language
  • TESOL - Teaching English to Speakers of Other Languages
  • TEVAC - Teaching English for Vacation and Activity Courses
  • TOEFL - Test of English as a Foreign Language
  • TOEIC - Test of English for International Communication
  • TP - Teaching Practice
U
  • UCLES - University of Cambridge Local Examinations Syndicate
  • UETESOL - University Entrance Test in English for Speakers of Other Languages (NEAB)
  • UKCOSA - United Kingdom Council for Overseas Student Affairs
  • ULEAC - University of London Examinations and Assessment Council
  • UODLE - University of Oxford Delegacy of Local Examinations
V
  • VSO - Voluntary Service Overseas
W
  • WEFT - Written English for the Tourism Industry (LCCIEB)
  • WYSTC - The World Youth and Student Travel Conference
X-Y-Z
  • There are no acronyms in this section