Item response theory (IRT) methods were used to develop a neuropsychological test battery with matched English and Spanish language forms. Candidate items for 12 scales measuring core neuropsychological abilities were generated and administered to community-dwelling elderly participants, 200 tested in Spanish and 208 tested in English. IRT analyses were then used to eliminate linguistically biased items and to refine the scales so that they assess broad ranges of ability. Reasonably good psychometric matching of scales was achieved both within and across the English and Spanish language forms. All scales were sensitive to cognitive impairment as measured by the Mini-Mental State Examination (MMSE), and the relationships between scale scores and MMSE scores were highly similar in the English and Spanish groups. These results support the use of IRT methods in cross-cultural and multilingual test development and indicate that this strategy has potential for future neuropsychological test development.