
Matthew Scillitani on High-Range Psychometrics, Validity, and Test Security

How does Matthew Scillitani approach validity, bias, score interpretation, and security in supervised high-range psychometric testing?

By Scott Douglas Jacobsen · Published about 2 hours ago · 5 min read

Matthew Scillitani is a psychometrics practitioner at Neurolus Psychometrics focused on developing supervised, time-limited high-range ability examinations. He co-launched The Mental Inventor with Paul Cooijmans as an empirical testbed for a central measurement question: whether performances can be validly differentiated in the extreme right tail under proctored conditions. His approach emphasizes procedural integrity—identity verification, approved proctoring, and rule enforcement—alongside cautious claims about interpretation until reliability and validity evidence is established. He highlights emerging threats to unsupervised testing, including AI-assisted responding and large-scale collaboration, and advocates peer review before formal reclassification.

Scott Douglas Jacobsen interviews Matthew Scillitani on the psychometric ambitions and safeguards behind supervised, time-limited high-range testing at Neurolus Psychometrics and The Mental Inventor. Scillitani explains that exploratory validity work may begin at 50 submissions, with stronger analyses at 100 or 250, using prior candidate data to reduce sample requirements. He stresses moderate cross-section correlations as evidence of broad reasoning, transparent reporting of selection bias, and strict standards for excluding compromised sittings. The discussion also addresses score uncertainty, interpretive restraint, third-party misuse, and evolving security threats, including answer leakage, collaboration, and AI-era integrity concerns.

Scott Douglas Jacobsen: What empirical threshold would move from exploratory data collection to a formal validation study?

Matthew Scillitani: We plan to start exploring construct validity at 50 submissions, with follow-up analyses at 100 and, if needed, 250 submissions.

Intuitively, these samples sound too small, but understand that we are not generating norms from nothing. In many cases, we already know candidates' prior scores on related exams. This allows us to use methods such as rank equating, reducing sample-size requirements compared with traditional norming methods that rely on an unselected population.

That said, 50 submissions may only support an exploratory analysis. The initial goal is to observe trends and possible construct measurement. If results are inconclusive, we would continue collecting data and re-evaluate at 100 and again at 250 submissions.
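Rank-based equating of the kind Scillitani mentions can be sketched as follows. Everything here is illustrative: the score values are invented, and the use of sorted ranks with linear interpolation is one simple way to anchor a new exam to a known scale, not the actual Neurolus procedure.

```python
import numpy as np

# Invented data: candidates' known scores on a prior, already-normed exam,
# and the same candidates' raw scores on the new exam.
prior = np.array([138, 142, 147, 151, 156, 160, 165, 171, 178, 190])
raw = np.array([13, 11, 16, 14, 17, 19, 22, 24, 27, 31])

# Sorting both sets pairs equal ranks; matched ranks define the conversion.
prior_sorted = np.sort(prior)
raw_sorted = np.sort(raw)

def equate(raw_score: float) -> float:
    """Map a raw score on the new exam to the prior scale by rank,
    interpolating linearly between observed anchor points."""
    return float(np.interp(raw_score, raw_sorted, prior_sorted))

print(equate(20))  # a raw 20 falls between the raw 19 and raw 22 anchors
```

Because the anchors come from already-scaled candidates, even a few dozen pairs constrain the conversion far more than norming from scratch on an unselected population would.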

Jacobsen: The exam includes verbal, numerical, and spatial items. What evidence would reflect broad reasoning ability rather than an unusually strong specialty cognitive profile?

Scillitani: The appropriate approach here is to examine the relationship between the three sections. If they correlate very highly, it suggests they measure more or less the same thing. Ideally, there is a moderate positive intercorrelation across sections, such as 0.4 to 0.6.
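The check described above can be illustrated with simulated data. The generating model and the 0.9 loading below are assumptions chosen purely so the sections correlate moderately; real section scores would replace the simulated columns.

```python
import numpy as np

# Simulated section scores: each loads on a shared "general" factor plus
# section-specific noise, so pairwise correlations land in a moderate range.
rng = np.random.default_rng(0)
n = 200
g = rng.normal(size=n)                       # shared general factor
verbal = 0.9 * g + rng.normal(size=n)
numerical = 0.9 * g + rng.normal(size=n)
spatial = 0.9 * g + rng.normal(size=n)

# 3x3 matrix of pairwise Pearson correlations between sections.
r = np.corrcoef(np.vstack([verbal, numerical, spatial]))
print(np.round(r, 2))
```

Off-diagonal values near 1.0 would suggest the sections are redundant; values in roughly the 0.4 to 0.6 band are consistent with related but distinct facets of a broad reasoning ability.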

Jacobsen: Eligibility is limited to English-speaking adults who can arrange an approved proctor. How will you estimate the selection bias built into it?

Scillitani: This exam necessarily produces a selective sample because it requires candidates to be English-speaking adults, find a proctor, and have the willingness to sit for a challenging exam.

We intend to document this clearly and publish aggregate (anonymized) candidate characteristics in the statistical reports. This includes country, age, sex, and other relevant demographics, making the sample in question clear to both researchers and candidates.

Jacobsen: If results begin to differ systematically by proctor type or testing environment, what would count as enough distortion to justify excluding sittings?

Scillitani: The decision to exclude data is serious because post hoc removal of inconvenient results can permanently damage the integrity of our research. It is best to exclude data only when there is clear, well-documented evidence that the sitting was objectively compromised by cheating or improper testing procedures.

That evidence does not necessarily need to be a confession; it may also show up in the statistics. Examples include anomalous response patterns, such as impossibly similar responses in two submissions from the same town, or a documented mishap, such as a candidate needing to exit the exam early.

We will internally document any exclusions so that peer reviewers can judge for themselves whether those exclusions were justified.

Jacobsen: Retesting is not permitted. How do you plan to estimate the uncertainty around an individual high-end score?

Scillitani: Uncertainty will be estimated psychometrically via reliability and the standard error of measurement.

Scores, outliers or not, should always be understood in the context of their margin of uncertainty, which will be made known to candidates and organizations when the first statistical report is published.

Jacobsen: You provide scaled scores. How do you prevent the scale from encouraging stronger conclusions than intended?

Scillitani: This is both a technical and ethical issue. At this early stage, scaled scores are used because the exam is not yet standardized, and we do not want the terminology to imply greater normative maturity than is warranted.

We also do not present these scores as measuring I.Q. or any other construct, both in score reports and on the website. The score conversion table exists only to provide candidates and organizations with a point of reference, not to make any claims the data cannot yet support.

Jacobsen: If outside organizations use the exam, where will responsibility begin and end in preventing overclaiming or misuse of results?

Scillitani: Third parties are responsible for the claims they make. However, that does not mean that publishers have no responsibility at all. We provide clear documentation, interpretive limits, and as much statistical information as possible so that nobody is misled.

Jacobsen: As AI systems and answer-sharing methods improve, how will you update the exam to preserve security?

Scillitani: AI is not yet a major concern because the exam procedure disallows electronics, making AI tools inaccessible during a sitting. But there are more immediate security concerns, such as answer leakage.

Unusually similar response patterns or geographically clustered irregularities will be flagged for review. And if a specific location or proctoring option shows signs of compromise, we will investigate and resolve the issue in the fairest way possible.
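A screen for unusually similar response patterns can be as simple as comparing pairwise answer agreement against a review threshold. This is a toy sketch, not the actual detection procedure; the submissions, the agreement measure, and the 0.90 cutoff are all illustrative assumptions.

```python
from itertools import combinations

# Invented answer strings for three candidates on an eight-item exam.
submissions = {
    "cand_a": ["b", "d", "a", "c", "e", "b", "a", "d"],
    "cand_b": ["b", "d", "a", "c", "e", "b", "a", "d"],  # identical answers
    "cand_c": ["a", "d", "c", "c", "b", "e", "a", "b"],
}

def agreement(x, y):
    """Fraction of items on which two submissions give the same answer."""
    return sum(p == q for p, q in zip(x, y)) / len(x)

THRESHOLD = 0.90  # illustrative cutoff, not a stated policy

# Flag every pair of submissions whose agreement meets the cutoff.
flagged = [
    (a, b)
    for (a, xs), (b, ys) in combinations(submissions.items(), 2)
    if agreement(xs, ys) >= THRESHOLD
]
print(flagged)
```

A flag like this would only trigger human review; on its own, item agreement cannot distinguish collusion from coincidence, which is why geographic and procedural context matters.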

Over time, this may require tactical countermeasures, but I cannot say publicly what they would be. Measures may already be in place to identify compromised sittings so that legal action can be taken against the culprit.

Jacobsen: Thank you very much for the opportunity and your time, Matthew.

Scott Douglas Jacobsen is a blogger on Vocal with over 130 posts on the platform. He is the Founder and Publisher of In-Sight Publishing (ISBN: 978–1–0692343; 978–1–0673505) and Editor-in-Chief of In-Sight: Interviews (ISSN: 2369–6885). He writes for International Policy Digest (ISSN: 2332–9416), The Humanist (Print: ISSN, 0018–7399; Online: ISSN, 2163–3576), Basic Income Earth Network (UK Registered Charity 1177066), Humanist Perspectives (ISSN: 1719–6337), A Further Inquiry (SubStack), Vocal, Medium, The Good Men Project, The New Enlightenment Project, The Washington Outsider, rabble.ca, and other media. His bibliography index can be found via the Jacobsen Bank at In-Sight Publishing, comprising more than 10,000 articles, interviews, and republications across more than 200 outlets. He has served in national and international leadership roles within humanist and media organizations, held several academic fellowships, and currently serves on several boards. He is a member in good standing of numerous media organizations, including the Canadian Association of Journalists, PEN Canada (CRA: 88916 2541 RR0001), Reporters Without Borders (SIREN: 343 684 221/SIRET: 343 684 221 00041/EIN: 20–0708028), and others.

Photo by Shana Van Roosbroek on Unsplash.
