Simon
Full pipeline · 20260519T202149Z

Food Contaminants and Cross-Contamination

May 19, 2026, 20:21 UTC · 17 ok · 7 skipped · 0 failed

Language pairs: 17
Flagged segments: 55 of 74 scored
Errors found: 115 (54 / 61 by severity)
Corrected: 32
System xCOMET (triage mean, before ▸ after): 0.521 ▸ 0.549
Threshold: 0.80 (default)
Models
Segmentation: us.anthropic.claude-sonnet-4-6
Scoring (xCOMET): xcomet-lite-endpoint (SageMaker)
Error detection: us.anthropic.claude-opus-4-7
Rewriting: us.anthropic.claude-opus-4-7

Manifest rebuilt from logs (original_pipeline_log, per_pair_pipeline_diff, es419_resume) — per-file counts are derived from the per-pair output.

en-es-419: originally failed on the Stage 3 regex (es-419 digits); resumed and completed via --start-stage 3 after the fix.
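The resume described above can be pictured with a small staged runner. This is only a sketch under stated assumptions: the `--start-stage` flag comes from the report, but the `--pair` flag, the number of stages, and every function name here are hypothetical, and the real pipeline's CLI may look quite different.

```python
import argparse


def make_stage(number, trace):
    """Build a placeholder stage that records when it ran (hypothetical)."""
    def run(pair):
        trace.append((number, pair))
    return run


def run_pipeline(stages, pair, start_stage=1):
    """Run stages numbered from 1, skipping every stage before start_stage."""
    if not 1 <= start_stage <= len(stages):
        raise ValueError(f"start_stage must be between 1 and {len(stages)}")
    for number, stage in enumerate(stages, start=1):
        if number < start_stage:
            continue  # earlier stages already completed in the first run
        stage(pair)


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Staged pipeline runner (sketch)")
    parser.add_argument("--pair", required=True)  # hypothetical flag
    parser.add_argument("--start-stage", type=int, default=1)
    return parser.parse_args(argv)


if __name__ == "__main__":
    trace = []
    # Four stages is an assumption made for this illustration only.
    stages = [make_stage(n, trace) for n in range(1, 5)]
    args = parse_args(["--pair", "en-es-419", "--start-stage", "3"])
    run_pipeline(stages, args.pair, args.start_stage)
    print(trace)
```

Resuming this way only makes sense if the outputs of the earlier stages were persisted by the first run, which the report implies but does not state.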

Language pairs

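| Language | Pair | Flagged | Errors | Severity | Corrected | xCOMET before | xCOMET after | Δ |
|---|---|---|---|---|---|---|---|---|
| Amharic | en-am | 4 | 7 | 3 / 4 | 2 | 0.531 | 0.539 | +0.008 |
| Burmese | en-my | 4 | 10 | 3 / 7 | 3 | 0.557 | 0.572 | +0.016 |
| Chinese (Simplified) | en-zh-CN | 4 | 8 | 2 / 6 | 2 | 0.685 | 0.706 | +0.021 |
| Khmer | en-km | 4 | 11 | 7 / 4 | 4 | 0.416 | 0.515 | +0.099 |
| Somali | en-so | 4 | 11 | 6 / 5 | 1 | 0.293 | 0.287 | -0.006 |
| Swahili (Kenya) | en-sw-KE | 4 | 7 | 5 / 2 | 2 | 0.404 | 0.397 | -0.007 |
| Swahili (Tanzania) | en-sw-TZ | 4 | 10 | 6 / 4 | 2 | 0.404 | 0.413 | +0.008 |
| Vietnamese | en-vi | 4 | 7 | 2 / 5 | 3 | 0.668 | 0.698 | +0.030 |
| Arabic | en-ar | 3 | 6 | 2 / 4 | 2 | 0.582 | 0.615 | +0.034 |
| French (Canada) | en-fr-CA | 3 | 3 | 2 / 1 | 2 | 0.709 | 0.734 | +0.025 |
| Hindi | en-hi | 3 | 5 | 2 / 3 | 3 | 0.575 | 0.639 | +0.065 |
| Portuguese (Brazil) | en-pt-BR | 3 | 8 | 4 / 4 | 1 | 0.733 | 0.742 | +0.009 |
| Spanish (Latin America) | en-es-419 | 3 | 6 | 4 / 2 | 1 | 0.669 | 0.726 | +0.057 |
| Tagalog | en-tl | 3 | 7 | 3 / 4 | 3 | 0.416 | 0.468 | +0.051 |
| Ukrainian | en-uk | 3 | 5 | 1 / 4 | 0 | 0.534 | 0.534 | 0.000 |
| French | en-fr | 2 | 4 | 2 / 2 | 1 | 0.684 | 0.755 | +0.071 |
| English | en-en | 0 | 0 | - | 0 | 0.000 | 0.000 | 0.000 |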
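The headline figures can be rechecked against the per-pair rows. The sketch below assumes each row reads as flagged segments, errors, two severity counts, corrected, then xCOMET before and after; that reading is inferred from the totals rather than stated in the report, and the names `sev_a` and `sev_b` are placeholders because the severity labels did not survive. The en-en row shows no severity counts and is treated as 0 / 0 here.

```python
# (pair, flagged, errors, sev_a, sev_b, corrected, xcomet_before, xcomet_after)
ROWS = [
    ("en-am", 4, 7, 3, 4, 2, 0.531, 0.539),
    ("en-my", 4, 10, 3, 7, 3, 0.557, 0.572),
    ("en-zh-CN", 4, 8, 2, 6, 2, 0.685, 0.706),
    ("en-km", 4, 11, 7, 4, 4, 0.416, 0.515),
    ("en-so", 4, 11, 6, 5, 1, 0.293, 0.287),
    ("en-sw-KE", 4, 7, 5, 2, 2, 0.404, 0.397),
    ("en-sw-TZ", 4, 10, 6, 4, 2, 0.404, 0.413),
    ("en-vi", 4, 7, 2, 5, 3, 0.668, 0.698),
    ("en-ar", 3, 6, 2, 4, 2, 0.582, 0.615),
    ("en-fr-CA", 3, 3, 2, 1, 2, 0.709, 0.734),
    ("en-hi", 3, 5, 2, 3, 3, 0.575, 0.639),
    ("en-pt-BR", 3, 8, 4, 4, 1, 0.733, 0.742),
    ("en-es-419", 3, 6, 4, 2, 1, 0.669, 0.726),
    ("en-tl", 3, 7, 3, 4, 3, 0.416, 0.468),
    ("en-uk", 3, 5, 1, 4, 0, 0.534, 0.534),
    ("en-fr", 2, 4, 2, 2, 1, 0.684, 0.755),
    ("en-en", 0, 0, 0, 0, 0, 0.000, 0.000),  # severity assumed 0 / 0
]


def totals(rows):
    """Sum the count columns across all language pairs."""
    return {
        "flagged": sum(r[1] for r in rows),
        "errors": sum(r[2] for r in rows),
        "sev_a": sum(r[3] for r in rows),
        "sev_b": sum(r[4] for r in rows),
        "corrected": sum(r[5] for r in rows),
    }


def mean_score(rows, column):
    """Unweighted mean of a score column over all rows, rounded to 3 places."""
    return round(sum(r[column] for r in rows) / len(rows), 3)


if __name__ == "__main__":
    print(totals(ROWS))
    print(mean_score(ROWS, 6), mean_score(ROWS, 7))
```

Under this reading the sums match the summary (55 flagged, 115 errors split 54 / 61, 32 corrected), and the system xCOMET values of 0.521 and 0.549 are reproduced by an unweighted mean over all 17 rows. That mean includes the en-en row at 0.000; leaving it out would raise both means to roughly 0.55 and 0.58.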

Skipped — no xCOMET coverage

7 languages: Hmong (hmn), Haitian Creole (ht), S'gaw Karen (ksw), Marshallese (mh), Oromo (om), Kinyarwanda (rw), Tigrinya (ti)

These target languages have no reliable xCOMET-lite metric, so the pipeline skipped them before scoring.
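A pre-scoring skip of this kind could be as simple as partitioning the targets against a coverage set. The sketch below is illustrative only: `XCOMET_COVERED` is a hypothetical set built from the targets this run actually scored, not an official list of languages the metric supports, and `partition_targets` is not a function from the pipeline.

```python
# Hypothetical coverage set: the target codes that were scored in this run.
XCOMET_COVERED = {
    "am", "my", "zh-CN", "km", "so", "sw-KE", "sw-TZ", "vi", "ar",
    "fr-CA", "hi", "pt-BR", "es-419", "tl", "uk", "fr", "en",
}

# Targets listed as skipped in the report.
SKIPPED_IN_REPORT = ["hmn", "ht", "ksw", "mh", "om", "rw", "ti"]


def partition_targets(targets, covered=XCOMET_COVERED):
    """Split targets into (scored, skipped), preserving input order."""
    scored = [t for t in targets if t in covered]
    skipped = [t for t in targets if t not in covered]
    return scored, skipped


if __name__ == "__main__":
    all_targets = sorted(XCOMET_COVERED) + SKIPPED_IN_REPORT
    scored, skipped = partition_targets(all_targets)
    print(len(scored), "scored;", len(skipped), "skipped:", skipped)
```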