
Assessment tools in the AI era: a practical market map for universities
By Naomi Rowan, Founder & Consultant, Gratitude Worldwide Ltd
Published: April 2026 | Last updated: 6 May 2026
How to tool up for assessment change without automating confusion
Note: This is a public-source market map, not a procurement recommendation. Product features, integrations and licensing vary by institution, configuration and contract. Always check details with vendors, procurement colleagues, internal technical teams and relevant policy owners.
In brief
Universities are not short of assessment tools. The harder task is matching the tool to the assessment problem.
Some tools strengthen existing LMS workflows. Some provide more visibility into written assessment. Some support similarity checking, originality review or AI-writing indicators. Others focus on secure exams, scanned scripts, AI-supported feedback, peer assessment, portfolios, STEM assessment, coding tasks or Moodle-specific marking and verification.
A useful market scan separates these categories before comparing products.
The strongest decision is not the one with the longest feature list but the one that fits the assessment type, workflow, integration environment, policy position, support model and staff/student adoption needs.
The challenge
Universities are under pressure to make assessment technology decisions faster than many teams would like.
Institutions do not need more philosophy; they need workload relief.
While this is true, staff capacity is not separate from educational quality. Staff cannot exercise careful judgment inside broken workflows.
AI has raised difficult questions about authorship, academic integrity, feedback, marking workload, student guidance and the future of coursework. At the same time, many institutions are still working with assessment processes that have grown around Moodle or Canvas settings, spreadsheets, manual checks, departmental variation, local workarounds and unclear handoffs.
It is easy for the conversation to move quickly to platform comparisons, especially when teams are juggling AI, workload, academic integrity, feedback quality and assessment policy at the same time. I would slow the conversation down enough to look at the work itself: what needs to become clearer, more consistent, more scalable or more trustworthy?
Once that is visible, the technology conversation becomes sharper. A secure exam platform, a process-visibility tool, a similarity checker, an AI-supported feedback tool and a peer review platform all solve different problems. Choosing between them before understanding the workflow is how institutions end up automating confusion.
The right tool can make a real difference, but the strongest technology decisions usually come after the workflow has been understood.
A useful distinction
Not all friction is equal.
Some friction is waste: duplicated admin, unclear handoffs, manual grade handling, confusing platform steps, inconsistent guidance and processes that rely on local memory.
Some effort is valuable: drafting, judging, moderating, interpreting evidence, giving feedback and helping students understand how to improve.
Assessment technology should reduce the first without accidentally removing the second.
If your institution is reviewing assessment tools, the useful first step is often not a demo. It is a clear view of the current assessment workflow: where effort is being lost, where judgement needs support, and what any tool would need to make more workable.
The alternative to unsupported automation is not unsupported human judgement. The alternative is accountable judgement: clear criteria, transparent evidence, moderation, escalation routes and human interpretation where decisions affect students.
Why AI has made the tooling question more urgent
AI has not created every assessment problem universities are now facing. In many cases, it has made existing problems harder to ignore.
Assessment design was already uneven in places. Feedback workflows were already time-intensive. Marking and moderation already varied between departments. Moodle and Canvas were already being stretched in different ways, and staff already needed clearer guidance, better adoption support and more workable processes.
AI has added pressure because the stakes now feel higher. Institutions are thinking about evidence, fairness, workload, academic integrity, student communication, staff confidence, governance and acceptable use at the same time.
Jisc’s AI in assessment work is a useful sector signal because it separates two strands universities are already navigating: specialist tools designed for marking and feedback, and general-purpose AI tools being used within controlled assessment workflows. The value of that work is not simply the vendor list; it is the implementation detail it surfaces around quality, workload, human oversight, setup, governance and staff confidence.
EDUCAUSE’s 2026 research on AI and work in higher education shows the same implementation gap from another angle: institutional AI strategies are becoming more common, but policy awareness, ROI measurement, tool governance and day-to-day staff use are still uneven.
This is where assessment technology decisions become complex. Universities are not simply choosing software; they are making decisions about evidence, trust, workload, integration, support and educational purpose.
The market is not one market
The assessment technology market can look like a single category from the outside. In practice, it is a set of overlapping tool families that solve different problems.
LMS-native tools support submission, marking, feedback and quiz workflows. Process-visibility tools give more context around how written work is produced. Similarity and originality tools provide document-level signals. AI-supported feedback tools sit closer to marking and review. Digital exam platforms focus on secure delivery and exam management. Portfolio and peer-review tools support different kinds of assessment design. STEM and programming tools often solve highly discipline-specific problems.
Grouping all of these together as “assessment tools” makes procurement conversations harder than they need to be. It also increases the risk that a university chooses a strong product for the wrong use case.
Market map at a glance
Table 1: Market map at a glance
| If the main problem is… | Look at… | Examples | What to check |
|---|---|---|---|
| Improving existing assignment, quiz or marking workflows | LMS-native assessment tools | Moodle Assignment, Moodle Quiz, Canvas Assignments, SpeedGrader, New Quizzes | Is the existing LMS being used well, or are unclear processes making the platform look like the problem? |
| Understanding written-assessment process and authorship | Written-assessment and process-visibility tools | Cadmus | Do you need visibility into how work is produced, not only a final submission check? |
| Checking similarity, originality or possible AI-written text | Similarity/originality tools | Turnitin, Inspera Originality | What kind of evidence is produced, and how will staff interpret it fairly? |
| Supporting AI-assisted marking or feedback | AI-supported marking and feedback tools | Graide, KEATH, TeacherMatic, general-purpose AI tools used within controlled workflows | What does human oversight mean, and how will quality, consistency and workload be evaluated? |
| Delivering secure digital exams at scale | End-to-end digital exam platforms | Inspera, WISEflow, Ans, ExamSoft, BetterExaminations, Questionmark, Cirrus | Are exam design, delivery, invigilation, accessibility, marking, grade return and support all in scope? |
| Marking handwritten, scanned or problem-based work at scale | Online grading tools | Gradescope, Crowdmark | Who scans, allocates, marks, moderates, releases and resolves grades? |
| Supporting peer, group, reflective or authentic assessment | Peer, portfolio and evidence tools | FeedbackFruits, PebblePad, Watermark | Does the tool strengthen assessment design rather than only monitor behaviour? |
| Automating discipline-specific assessment | STEM, maths and programming tools | STACK, CodeRunner, Numbas, Möbius, CodeGrade | Does the tool fit the discipline, question type and staff capability? |
| Strengthening marking, moderation or verification in Moodle | Moodle/Totara workflow extensions | Accipio Grade, Moodle Coursework/double-marking options | Can the institution improve Moodle workflows before procuring a separate platform? |
How to use this market map
This page is designed to support early thinking. It is not a replacement for procurement, technical review or supplier due diligence.
For each tool category, it helps to be clear about:
- the assessment types in scope;
- where the current workflow is creating friction;
- which systems need to connect;
- what data the tool will produce;
- who will interpret that data;
- what manual work will remain;
- what policy or guidance needs to exist;
- what staff and students will need to do differently;
- what a realistic pilot should test.
Most assessment technology decisions are not purely technical. They are decisions about educational purpose, process, people, evidence, trust and support.
Start with what the LMS already does
Before looking outward, it is worth understanding what Moodle, Canvas or another institutional LMS can already support.
Moodle Assignment can support submission, grading, feedback, rubrics or marking guides, marking workflow and marker allocation, depending on configuration. Canvas offers native assignment, grading, rubric and moderated grading workflows through tools such as SpeedGrader and related assessment features.
That does not mean the LMS can do everything; it often cannot.
It does mean that some “platform problems” are actually process problems. A department may be using Moodle differently from another department. Marking states may not be consistently applied. Offline marking may rely on local workarounds. Guidance may be thin. The gradebook may not match the assessment process people think they are running.
Sometimes a new platform is needed. Sometimes better configuration, clearer workflows, improved guidance or targeted development will get further. Often, the answer is a combination.
Written assessment and process visibility
AI has increased interest in tools that show more of the student writing process.
Cadmus sits in the written-assessment and process-visibility part of the market. It provides a structured environment for written tasks and integrates with Moodle and Canvas through LTI 1.3, including launch, grade pass-back and membership synchronisation. Its Activity Reports give staff process-level information such as writing activity, work sessions and pasted content.
That makes Cadmus relevant where an institution wants more context around how work was produced, rather than relying only on a final-file submission.
Process visibility can support better conversations and stronger evidence, but does not remove the need for judgement. Institutions still need to decide how the data will be interpreted, who will review it, what students will be told, and what will happen when the evidence is ambiguous.
Turnitin sits in a different part of the landscape. It is best understood as a similarity, feedback and academic-integrity workflow tool around submitted documents, rather than a full writing-process environment. Its AI-writing guidance is also clear that an AI-writing score should not be used as the sole basis for adverse action against a student.
A similarity report, an AI-writing indicator, a writing-process report, an exam log and a draft history are all different forms of evidence. Treating them as interchangeable creates risk.
AI-supported marking and feedback tools
Tools such as Graide, KEATH and TeacherMatic sit in the AI-supported marking and feedback category. Jisc’s current pilot is exploring these specialist tools alongside wider work with general-purpose AI tools such as ChatGPT, Claude, Gemini and Copilot.
This is one of the most interesting and most sensitive parts of the assessment technology landscape.
Detection is only one strand of the AI conversation; the more operational issue is whether AI can support clearer, more consistent or more timely feedback without weakening academic judgement.
A tool may generate feedback quickly, but the real test is whether that feedback is accurate, fair, criteria-aligned, transparent to students and manageable for staff once setup, checking and governance are included.
For universities, this changes the shape of a pilot. The pilot should not only ask whether the tool “works.” It needs to ask what kind of work the tool changes.
- Does it reduce workload, or move workload into setup and review?
- Does it improve feedback quality, or produce more feedback of uneven usefulness?
- Does it support marker judgement, or create pressure to accept AI output too quickly?
- Does it help students understand how to improve, or simply make the feedback process faster?
Human judgement can be biased, and automation can look more consistent. But the answer to biased judgement is not to remove judgement; it is to give that judgement structure, transparency and moderation.
AI-supported feedback may become valuable, but it needs to sit inside a clear assessment workflow rather than around the edges of one. I am not arguing for less technology. I am arguing for careful decision-making about what technology should and should not be used for.
End-to-end digital exam and assessment platforms
Some institutions are not primarily trying to improve coursework. They are trying to manage formal digital exams, secure assessment delivery, exam authoring, invigilation, marking, moderation, feedback release and reporting at scale.
That is where platforms such as Inspera, WISEflow, Ans, ExamSoft, BetterExaminations, Questionmark and Cirrus may enter the conversation.
These platforms can make exam and assessment processes more coherent, especially where institutions need secure delivery, structured administration, marking workflows, integrity controls, reporting and integration with existing systems. They also tend to touch many parts of the university at once: accessibility, accommodations, identity, device readiness, invigilation, student records, grade return, exception handling, support models and staff workload.
Used well, an end-to-end assessment platform can create more consistent institutional processes. Used without enough workflow design, it can become a large technical project that still relies on manual workarounds.
Scanned work, handwritten assessment and marking at scale
For some disciplines, the pressure is not essay authenticity or secure exams. It is the practical challenge of grading large volumes of handwritten, problem-based or technical work consistently.
Gradescope and Crowdmark are good examples in this space. They can help with scanned scripts, problem sheets, rubric consistency, large marking teams, regrades and feedback return.
These tools are useful because they focus on a specific grading pain point. The surrounding operating model still matters: who scans, who uploads, how markers are allocated, how moderation works, how grades are returned, and what happens when scripts are missing, unreadable or incorrectly matched.
A grading tool can streamline marking, but it still needs a clear process around it.
Peer, group, portfolio and authentic assessment
Not every assessment challenge is about detection, security or marking speed.
Some institutions are trying to design more authentic, reflective, collaborative or longitudinal assessment, which points to a different part of the market.
FeedbackFruits supports peer feedback, group work and other learning activities, while PebblePad is often used for portfolios, reflection, employability, placement and evidence over time. Watermark sits closer to outcomes, evidence collection, accreditation and programme-level assessment processes.
This category matters because AI-era assessment conversations can become too narrow when they focus only on misuse.
Universities also need tools and workflows that support better assessment design: staged tasks, feedback loops, peer interaction, reflection, evidence of process, authentic outputs and programme-level learning. In this area, the key test is whether the tool helps staff and students engage in a better assessment process.
STEM, maths and programming assessment
Some assessment problems are highly discipline-specific.
STACK is an open-source online assessment system for mathematics and STEM. Numbas, developed at Newcastle University, is a similar open-source system for mathematical subjects, with randomised questions, automatic marking and instant feedback. CodeRunner and CodeGrade sit closer to programming assessment, code grading and automated feedback workflows.
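To make the idea of automated code grading concrete, here is a deliberately minimal sketch of the pattern these tools automate: run a submission against instructor-written test cases and turn the results into a raw mark and per-case feedback. It is illustrative only; the function names, exercise and test cases are hypothetical, and this is not the implementation of any product named above.

```python
def run_tests(student_fn, test_cases):
    """Run a submitted function against instructor test cases and
    return per-case results plus a raw mark."""
    results = []
    passed = 0
    for args, expected in test_cases:
        try:
            actual = student_fn(*args)
            ok = actual == expected
        except Exception as exc:  # a crashing submission still gets feedback
            actual, ok = f"raised {type(exc).__name__}", False
        passed += ok
        results.append((args, expected, actual, "pass" if ok else "FAIL"))
    return results, passed, len(test_cases)


# Hypothetical exercise: compute the median of a list.
def student_median(values):  # stands in for the submitted code
    return sorted(values)[len(values) // 2]


cases = [
    (([1, 3, 2],), 2),       # odd-length list: this submission passes
    (([4, 1, 2, 3],), 2.5),  # even-length list: this submission fails
]
results, passed, total = run_tests(student_median, cases)
print(f"{passed}/{total} test cases passed")
for args, expected, actual, status in results:
    print(f"  {status}: median({args[0]!r}) -> expected {expected!r}, got {actual!r}")
```

Even a sketch this small shows where the institutional questions sit: someone has to write and maintain the test cases, decide how partial credit works, and handle the submission that crashes or games the tests.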
This is a useful reminder that “assessment technology” does not always mean a whole-institution platform.
For some departments, the best fit may be a specialist assessment engine, a Moodle plugin, a code autograder or a discipline-specific workflow. For others, that may be too narrow or too dependent on local technical expertise.
The decision depends on assessment type, staff capability, institutional support and how much local variation the university is prepared to sustain.
Moodle-specific marking and verification extensions
For Moodle institutions, another category is easy to miss: tools and modules that extend Moodle rather than replace it.
Accipio Grade is positioned as an advanced grading and quality-assurance tool for Moodle and Totara, with features around internal and external verification, allocation, blind marking and grade resolution. The wider Moodle ecosystem also offers Coursework and double-marking options, including workflows around marking, allocation, double marking and grade agreement.
This category is important when the core issue is not “we need a new assessment platform,” but “our Moodle assessment workflow needs stronger marking, moderation, verification or quality assurance.”
If the institution is already committed to Moodle, the choice is not always Moodle versus a new platform. It may be better Moodle configuration, targeted plugins, bespoke development, workflow redesign, or an external tool for a specific assessment type.
What “text tracking” really means
One of the most common AI-era questions is whether a tool can “track the text.”
That phrase needs unpacking because it can refer to very different kinds of evidence.
Table 2: Text tracking evidence
| Type of evidence | What it may show | What it does not automatically prove |
|---|---|---|
| Writing-process data | Editing patterns, work sessions, pasted content, drafts or activity timelines | That misconduct has or has not occurred |
| Similarity reports | Text matches with other sources | Whether the match is inappropriate or whether misconduct occurred |
| AI-writing indicators | A statistical signal that text may have been AI-generated | Proof of AI use or misconduct |
| Exam logs | Access, actions, timings or answer changes during an assessment | The full context behind a student’s behaviour |
| Version history | How a file or document changed over time | Why those changes were made or who made every decision |
| Declarations or process evidence | Student explanation of tools, drafting, sources or AI use | Independent verification of the whole process |
The practical issue is not simply whether data exists. It is whether the institution has a fair, transparent and workable process for interpreting that data.
Cadmus-style process visibility, Turnitin-style similarity or AI-writing indicators, exam logs and student declarations all provide different signals. None of them removes the need for academic judgement, policy clarity and careful communication.
This is where many institutions need to slow down, not because they are being indecisive, but because the evidential and procedural stakes are high.
Integration: what actually needs to move?
“Integrates with Moodle” or “integrates with Canvas” can mean several different things.
It might mean single sign-on, LTI launch, roster synchronisation, grade pass-back, group sync, deep linking, API access, student-record integration, or several of these in combination.
That difference is important because a technically available integration may still leave manual work in exactly the place the institution was trying to reduce it.
For example, one tool may launch neatly from the LMS but not return grades in the required format. Another may sync users but not groups. Another may work for one assessment type but leave moderation, exceptions or external review outside the platform. At small scale, those gaps may be manageable; at institutional scale, they can become the work.
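To make "grade pass-back" concrete, here is a minimal sketch of what returning one grade typically involves under LTI 1.3 Assignment and Grade Services (AGS): exchange a signed client assertion for a scoped access token, then POST a score to the line item's scores endpoint. The URLs are placeholders, the client-assertion JWT is assumed to be built and signed elsewhere, and real integrations add retries, logging and failure handling; that is exactly the detail worth probing in procurement.

```python
import datetime

import requests

# Placeholder endpoints. Real values come from the platform's LTI
# registration and the AGS lineitem claim delivered in the LTI launch.
TOKEN_URL = "https://lms.example.ac.uk/mod/lti/token.php"
LINEITEM_URL = "https://lms.example.ac.uk/line_items/42"


def post_score(client_assertion_jwt: str, user_id: str,
               score_given: float, score_maximum: float) -> None:
    """Publish one grade to the LMS gradebook via LTI 1.3 AGS."""
    # Step 1: swap a signed client-assertion JWT for a short-lived
    # access token scoped to the AGS score service.
    token_resp = requests.post(TOKEN_URL, data={
        "grant_type": "client_credentials",
        "client_assertion_type":
            "urn:ietf:params:oauth:client-assertion-type:jwt-bearer",
        "client_assertion": client_assertion_jwt,  # built and signed elsewhere
        "scope": "https://purl.imsglobal.org/spec/lti-ags/scope/score",
    })
    token_resp.raise_for_status()
    access_token = token_resp.json()["access_token"]

    # Step 2: POST the score to the line item's /scores endpoint using
    # the AGS media type. gradingProgress tells the LMS whether the mark
    # is final ("FullyGraded") or still in moderation ("PendingManual").
    score_resp = requests.post(
        f"{LINEITEM_URL}/scores",
        json={
            "userId": user_id,
            "scoreGiven": score_given,
            "scoreMaximum": score_maximum,
            "activityProgress": "Completed",
            "gradingProgress": "FullyGraded",
            "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        },
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/vnd.ims.lis.v1.score+json",
        },
    )
    score_resp.raise_for_status()
```

Even this happy-path sketch surfaces the operational questions: who holds the signing keys, what happens when a score POST fails mid-release for a large cohort, and whether the LMS line item actually maps to the institution's grade structure.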
Instead of asking only whether a tool integrates, I’d want to know what data moves, when it moves, who triggers it, what happens when it fails, whether grades return cleanly, whether groups and markers are supported, and whether student-record integration is needed.
This is often where the success or failure of an assessment tool is decided. The demo may look smooth, while the institutional workflow still depends on hidden manual steps.
Questions to ask before choosing or piloting a tool
Before choosing or piloting an assessment tool, I would want the university team to be clear on these questions:
- Which assessment types does the tool support well?
- Which assessment types does it not support well?
- Does it support Moodle, Canvas, Blackboard or Brightspace through LTI 1.3, API integration, grade pass-back, roster sync, group sync or deep linking?
- Does it strengthen student agency, or only increase monitoring?
- Does it make expectations clearer for students, or only make staff processes faster?
- Does it reduce inequity, or advantage students and staff who already have better AI access, confidence or support?
- What evidence or analytics does it produce?
- Who will interpret that evidence?
- Does it provide writing-process visibility, similarity checking, AI-writing indicators, exam logs, grading workflow or portfolio evidence, and which of those are actually needed?
- How does it handle accessibility, extensions, accommodations, late submissions, resits, exceptional circumstances and academic-integrity workflows?
- What manual work remains outside the platform?
- Who configures assessments, allocates markers, manages moderation, releases feedback and resolves errors?
- What training and support will academic staff, professional services teams and students need?
- What data protection, retention, consent and transparency questions need to be answered?
- What would success look like after a pilot?
These questions sit where procurement, policy and workflow meet. Without that clarity, an institution may buy a tool that solves the wrong problem.
What to look for in a pilot
A good pilot needs to test more than basic functionality; it needs to show whether the tool works inside the institution’s real assessment conditions.
That means testing:
- the assessment types that matter most;
- the marking and moderation workflow;
- student access and communication;
- staff setup time;
- feedback quality;
- exception handling;
- grade return;
- support burden;
- accessibility and accommodations;
- how evidence or analytics are interpreted;
- whether the tool reduces or redistributes workload.
It is also worth being honest about what the pilot cannot prove. A small pilot may show usability and workflow fit, but not whole-institution scalability. A controlled use case may not reveal departmental variation. A positive staff experience in one context may not translate automatically to another.
Pilots are useful, but they need to be designed around the decisions the institution actually needs to make.
The risk of automating confusion
Assessment technology can make a real difference. It can make workflows more visible, reduce repetitive administration, support better feedback, help staff manage large cohorts, create stronger evidence trails and give institutions a more practical way to respond to AI.
The risk is that technology can also move confusion faster. An unclear moderation process remains unclear inside a new platform. A policy gap does not disappear because a tool has been configured. A weak integration may simply move manual work into a different team. An AI indicator may create more risk if staff are not supported to interpret it carefully.
This is why I would usually start with the workflow rather than the demo. A focused diagnostic can help the institution understand what is happening now, where the friction sits, what AI changes, which platform questions matter, and whether the next step is redesign, Moodle support, a specialist tool, a pilot or ongoing implementation support.
Frequently asked questions
What is the best assessment tool for universities?
There is no single best tool. The right fit depends on the assessment type, workflow, integration environment, policy context, staff capability, student experience and support model.
Should universities choose an AI assessment tool first?
I would usually start with the assessment workflow, not the AI tool. Once the institution is clear about the evidence it needs, the policy position, the implementation conditions and the kind of staff/student support required, the tool decision becomes much more focused.
What is the difference between text tracking and similarity checking?
Text tracking usually refers to evidence about how work was produced, such as drafting, editing, pasted content or activity timelines. Similarity checking compares the submitted text with other sources. They are different kinds of evidence, and they need different interpretation.
Can AI detection prove academic misconduct?
No. AI-writing indicators should be treated as signals, not proof. Institutions need fair academic-integrity processes, staff guidance and clear student communication before using those indicators in decision-making. Turnitin’s own guidance says its AI-writing score should not be used as the sole basis for adverse action against a student.
Do Moodle or Canvas already support assessment workflows?
Yes. Moodle and Canvas already support many assessment workflows, including submission, grading, feedback, rubrics and gradebook processes. Complex marking, moderation, external review, offline marking or cross-department workflows may still need additional configuration, plugins, process redesign or specialist tools.
What should universities do before procuring an assessment platform?
Map the current workflow, identify friction points, clarify requirements, understand integration needs, define pilot success criteria and plan staff/student adoption. A platform decision made before that work is clear can easily automate confusion.
The best tool is the one that fits the work
There is no single best assessment platform for universities. There are better and worse fits for particular assessment types, institutional contexts and workflows.
A university reviewing written coursework may need a different solution from one reviewing secure exams. A STEM department may need a different tool from a portfolio-heavy professional programme. A Moodle institution with complex marking and moderation may not need the same approach as a university trying to replace a whole assessment-management ecosystem.
The most important work is to understand the assessment workflow before choosing the technology. That means looking carefully at academic practice, marking and moderation, feedback expectations, student experience, policy and quality requirements, platform and data flows, AI-related risks, staff workload, adoption, governance and support.
The tool matters, but more important is the strategy that underpins its use.
The strategy is to make assessment and feedback work better in practice, for students, staff, departments and the wider institution.
In the AI era, that means tooling up carefully. Universities need to reduce the friction that wastes time and attention, while protecting the intellectual and relational work that assessment still needs: agency, judgement, feedback, moderation, trust and learning.
That is how institutions avoid automating confusion and make change workable.
Related support
If you need help turning this market scan into requirements, pilot criteria, or a decision-ready shortlist, start with the diagnostic or a scoped pilot-evaluation engagement.