How nutrition data is derived in Nibblr

Every nutrient value in Nibblr can be traced to a published government food composition database. This page documents how we ingest those sources, normalise them to UK regulatory standards, and reconcile differences between them. Where we have known limitations, we name them.

Where the data comes from

CoFID 2021

UK Food Standards Agency / Public Health England. Open Government Licence v3.0. ~2,887 foods.

Swedish Livsmedelsdatabasen 2026

Livsmedelsverket (Swedish Food Agency). Open data terms. ~2,575 foods.

Norwegian Matvaretabellen 2026

Mattilsynet (Norwegian Food Safety Authority). CC BY 4.0. ~2,121 foods.

Canadian Nutrient File

Health Canada. Open Government Licence (Canada). ~5,690 foods.

Staged but not yet promoted: Frida v5.5 (DTU, Denmark, CC BY 4.0). Legacy McCance & Widdowson rows remain where the current CoFID equivalent has not yet superseded them. Nibblr-curated entries cover foods that are absent from every government source.

Each ingredient carries its source name and external reference (e.g. "CoFID 2021, food code 11-678"), retrievable via API and citable in published work.

We exclude USDA data. Mandatory American fortification of grains (folic acid, iron, thiamin, riboflavin, niacin) and the broader differences in the American food supply mean that USDA values misrepresent what our UK users actually eat. We restrict sources to UK / EU / EEA / Commonwealth databases.

See the full source register →

The unit we work in

All values are reported per 100 g of edible portion (UK FIC convention).

Salt vs sodium. Salt is derived from sodium where the source reports sodium only:

salt (g) = sodium (mg) × 2.5 / 1000

Where the source reports both, we cross-validate; disagreements greater than 10% are flagged for manual review, not silently corrected. Sodium is stored in milligrams per 100 g (McCance and FIC convention); salt is shown to users in grams.
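The conversion and the cross-check can be sketched as follows. This is a minimal illustration, not our production code: the function names are hypothetical, and measuring the disagreement relative to the larger of the two values is an assumption, since the rule above only fixes the 10% tolerance.

```python
def salt_g_from_sodium_mg(sodium_mg: float) -> float:
    """Salt equivalent (g per 100 g) from sodium (mg per 100 g)."""
    return sodium_mg * 2.5 / 1000


def salt_needs_review(sodium_mg: float, reported_salt_g: float,
                      tolerance: float = 0.10) -> bool:
    """True when reported salt and sodium-derived salt disagree by more than the tolerance."""
    derived = salt_g_from_sodium_mg(sodium_mg)
    baseline = max(derived, reported_salt_g)  # assumption: compare against the larger value
    if baseline == 0:
        return False  # both zero, nothing to reconcile
    return abs(derived - reported_salt_g) / baseline > tolerance
```

For a record with 400 mg sodium, the derived salt is 1.0 g; a source-reported salt of 1.2 g would be flagged, while 1.05 g would pass.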

Display rounding follows UK FIC Annex XV.

Energy

UK FIC Atwater factors:

Source-aware carb factor. Legacy McCance and any Nibblr-curated records use the historic UK 3.75 kcal/g (16 kJ/g) for available carbohydrate, reflecting how those values were originally derived. All other live sources use 4 kcal/g (17 kJ/g) per FIC. We disclose this because failing to handle it is a common quiet error in pipelines that mix sources.

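kJ is calculated independently of kcal, not by multiplying kcal by 4.184, because the FIC kJ and kcal factors are rounded separately (carbohydrate is 17 kJ/g, whereas 4 kcal/g × 4.184 ≈ 16.7 kJ/g). The two routes differ by 1–2 kJ per 100 g per macronutrient, and silently converting one to the other lets that drift accumulate across complex recipes.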

Source-reported energy is preserved alongside the recalculation. Differences greater than 20% flag the row for review.
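A sketch of the source-aware calculation, under stated assumptions. The FIC factors are those of Regulation (EU) No 1169/2011 Annex XIV (polyols, organic acids and salatrims omitted for brevity). The source identifiers and function names are hypothetical, and measuring the 20% difference relative to the source-reported value is an assumption.

```python
# nutrient: (kcal per g, kJ per g), per FIC Annex XIV
FIC_FACTORS = {
    "carbohydrate": (4.0, 17.0),
    "protein": (4.0, 17.0),
    "fat": (9.0, 37.0),
    "alcohol": (7.0, 29.0),
    "fibre": (2.0, 8.0),
}
LEGACY_CARB_FACTOR = (3.75, 16.0)  # historic UK factor for available carbohydrate
LEGACY_SOURCES = {"mccance_legacy", "nibblr_curated"}  # hypothetical identifiers


def energy_per_100g(grams: dict, source: str) -> tuple:
    """Return (kcal, kJ) per 100 g, each summed from its own factor set."""
    kcal = kj = 0.0
    for nutrient, amount in grams.items():
        kcal_factor, kj_factor = FIC_FACTORS[nutrient]
        if nutrient == "carbohydrate" and source in LEGACY_SOURCES:
            kcal_factor, kj_factor = LEGACY_CARB_FACTOR
        kcal += amount * kcal_factor
        kj += amount * kj_factor  # never derived from kcal
    return kcal, kj


def energy_needs_review(computed: float, reported: float,
                        tolerance: float = 0.20) -> bool:
    """True when recalculated and source-reported energy differ by more than the tolerance."""
    if reported == 0:
        return computed != 0
    return abs(computed - reported) / reported > tolerance
```

With 10 g carbohydrate, 5 g protein and 2 g fat, a current-source record gives 78 kcal / 329 kJ, while the same composition in a legacy record gives 75.5 kcal / 319 kJ.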

Available carbohydrate

EU FIC requires available carbohydrate (mono- + disaccharides + polysaccharides excluding fibre). Some North-American-origin records instead report total carbohydrate by difference (100 − water − protein − fat − ash − alcohol), which still includes fibre.

We compute available carbohydrate as carbs − fibre at the transform layer. Where fibre is missing, we default it to 0 and flag the row with an assumed-zero indicator. Free sugars are a distinct sub-component, handled in the "Free sugars" section below.
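The subtraction and the assumed-zero flag might look like this. The function name is hypothetical, and clamping a negative result to zero is an assumption not stated above.

```python
from typing import Optional, Tuple


def available_carbohydrate(total_carb_g: float,
                           fibre_g: Optional[float]) -> Tuple[float, bool]:
    """Return (available carbohydrate in g, fibre_assumed_zero flag)."""
    fibre_assumed_zero = fibre_g is None
    fibre = 0.0 if fibre_assumed_zero else fibre_g
    # Assumption: a negative difference is clamped to zero rather than stored.
    return max(total_carb_g - fibre, 0.0), fibre_assumed_zero
```

A record with 20 g carbohydrate and 3.5 g fibre yields 16.5 g with no flag; the same record with fibre missing yields 20 g and carries the flag.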

Vitamins and equivalents

Vitamin A is normalised to RAE (Retinol Activity Equivalents per IOM 2001 / EFSA 2015):

1 µg RAE = 1 µg retinol = 12 µg β-carotene = 24 µg α-carotene / β-cryptoxanthin
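As a worked illustration of the equivalence above (the function and parameter names are hypothetical):

```python
def vitamin_a_rae_ug(retinol_ug: float = 0.0,
                     beta_carotene_ug: float = 0.0,
                     alpha_carotene_ug: float = 0.0,
                     beta_cryptoxanthin_ug: float = 0.0) -> float:
    """Retinol Activity Equivalents in µg, using the 1 : 12 : 24 ratios."""
    return (retinol_ug
            + beta_carotene_ug / 12
            + (alpha_carotene_ug + beta_cryptoxanthin_ug) / 24)
```

For example, 100 µg retinol, 1,200 µg β-carotene and 240 µg α-carotene together give 100 + 100 + 10 = 210 µg RAE.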

Vitamin E. Where sources provide tocopherol fractions (Canadian: α / β / γ / δ), they are summed into vitamin_e_mg. Other sources report a single value.

Folate. Stored as total folate (µg). The user-facing label is "Folic acid" per FIC NRV vocabulary — the regulation uses the synthetic-form name for the NRV, and we mirror the regulation rather than the chemistry.

Cross-source reconciliation

Each source ingredient is normalised: lowercased, processing state extracted (raw / boiled / fried / etc.), form extracted (lean only, kernel only, etc.). Three automated matching strategies are layered, with manual review as the final fallback:

  1. Exact match on normalised name.
  2. Trigram fuzzy similarity via PostgreSQL pg_trgm.
  3. Semantic vector embeddings for meaning-level equivalence.
  4. Manual review for borderline cases below the auto-match threshold.
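The first two layers can be sketched in plain Python. The trigram function approximates pg_trgm's documented behaviour (lowercase, split on non-alphanumeric characters, pad each word with two leading spaces and one trailing space, score shared over total distinct trigrams); it is not the extension itself. The 0.6 threshold and the function names are hypothetical, and a None result stands for "hand over to the embedding and manual-review layers".

```python
import re


def trigrams(text: str) -> set:
    """Distinct padded trigrams of each word, in the style of pg_trgm."""
    grams = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = f"  {word} "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def trigram_similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams of both strings."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def match(name: str, candidates: list, threshold: float = 0.6):
    """Return (candidate, method), or None if neither layer matches."""
    for candidate in candidates:
        if candidate.lower() == name.lower():
            return candidate, "exact"
    best = max(candidates, key=lambda c: trigram_similarity(name, c), default=None)
    if best is not None and trigram_similarity(name, best) >= threshold:
        return best, "trigram"
    return None  # falls through to later layers
```

With candidates "salmon" and "salt", the name "salmons" matches "salmon" on trigrams (6 shared of 9 distinct), while "quinoa" matches nothing and falls through.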

Matched ingredients form a composite record. The composite inherits its primary value from the highest-priority source for the user's context (CoFID first for UK users); other sources gap-fill missing nutrient cells.

Provenance is preserved: every gap-filled value records which source supplied it, retrievable per ingredient via the API. Where sources disagree beyond tolerance, the composite uses the primary value and surfaces the disagreement rather than averaging them.
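A minimal sketch of priority-ordered gap-filling with per-nutrient provenance. The data shapes, source keys and function name are hypothetical, and disagreement detection is left out for brevity.

```python
def build_composite(records: dict, priority: list) -> tuple:
    """Merge per-source nutrient dicts; the first non-missing value in priority order wins.

    records:  {source: {nutrient: value or None}}
    priority: sources ordered from highest to lowest priority
    Returns (values, provenance), where provenance maps nutrient -> supplying source.
    """
    values, provenance = {}, {}
    for source in priority:
        for nutrient, value in records.get(source, {}).items():
            if value is not None and nutrient not in values:
                values[nutrient] = value
                provenance[nutrient] = source
    return values, provenance


# Illustrative numbers only, not real database values.
records = {
    "cofid_2021": {"protein_g": 20.0, "vitamin_d_ug": None},
    "matvaretabellen_2026": {"protein_g": 19.5, "vitamin_d_ug": 8.0},
}
values, provenance = build_composite(records, ["cofid_2021", "matvaretabellen_2026"])
```

Here protein comes from the primary source and is never averaged with the secondary value, while vitamin D is gap-filled and recorded as such.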

Quality gates

Five gates at ingestion, plus transform-layer validation.

Identity

Source + external reference + name must be present, else rejected.

FIC-7 mandatory

At least 4 of {energy, fat, saturated fat, carbohydrate, sugars, protein, salt or sodium}, else rejected. All 7 → "complete"; 4–6 → "partial".

Biological range

Each value bounded (energy 0–900 kcal, fat 0–100 g, sodium 0–40,000 mg, etc.). Out-of-range values are flagged, not rejected.

Internal consistency

Saturated ≤ total fat; sugars ≤ carbohydrates; computed energy within 10% of source-reported.

Deduplication

Records are unique on (source, external reference); a newer version of the same record updates in place rather than duplicating.
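Three of the gates above lend themselves to a short sketch: FIC-7 completeness, biological range, and internal consistency. Field names and flag labels are hypothetical, only the bounds quoted above are included, and measuring the 10% energy tolerance against the source-reported value is an assumption.

```python
from typing import List, Optional

FIC7 = ("energy", "fat", "saturated_fat", "carbohydrate", "sugars", "protein", "salt")
RANGES = {"energy": (0, 900), "fat": (0, 100), "sodium": (0, 40_000)}


def fic7_status(record: dict) -> str:
    """Return 'complete', 'partial' or 'rejected' for the FIC-7 gate."""
    present = 0
    for field in FIC7:
        has_value = record.get(field) is not None
        if field == "salt" and not has_value:
            has_value = record.get("sodium") is not None  # sodium stands in for salt
        if has_value:
            present += 1
    if present == 7:
        return "complete"
    return "partial" if present >= 4 else "rejected"


def review_flags(record: dict, computed_kcal: Optional[float] = None) -> List[str]:
    """Collect range and consistency flags; flagged rows are kept, not rejected."""
    flags = []
    for field, (low, high) in RANGES.items():
        value = record.get(field)
        if value is not None and not low <= value <= high:
            flags.append(f"{field}_out_of_range")
    fat, saturated = record.get("fat"), record.get("saturated_fat")
    if fat is not None and saturated is not None and saturated > fat:
        flags.append("saturated_exceeds_fat")
    carb, sugars = record.get("carbohydrate"), record.get("sugars")
    if carb is not None and sugars is not None and sugars > carb:
        flags.append("sugars_exceed_carbohydrate")
    reported = record.get("energy")
    if computed_kcal is not None and reported:
        if abs(computed_kcal - reported) / reported > 0.10:
            flags.append("energy_mismatch")
    return flags
```

A record reporting 950 kcal with 12 g saturated fat against 10 g total fat would pass the FIC-7 gate but carry two flags into manual review.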

Transform-layer validation additionally covers negatives, macro closure, name/nutrient mismatch, per-100 ml detection (liquids accidentally reported on volume basis), free-sugar completeness, and energy plausibility.

Of approximately 14,512 ingredient records currently held, 15 are flagged for manual review. We cite the figure as evidence that the gates catch real problems.

Recipe-level calculation

Recipe per-100 g is the weighted average of each ingredient's per-100 g, weighted by mass fraction of total raw mass.

Yield (recipe-level or per-ingredient %) converts raw mass to finished mass:

Yield is clamped to ≥1% to prevent division collapse. Per-serving = per-100 g × (serving_grams / 100). Fruit / veg / nut / seed percentage (FVNS%) is computed for HFSS scoring with the dried-fruit weighting prescribed by the 2018 model.
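The recipe arithmetic can be sketched as below, for the recipe-level yield case only. It assumes nutrients are conserved during cooking, so that only total mass changes; the function names and data shapes are hypothetical.

```python
def recipe_per_100g(ingredients: list, yield_pct: float = 100.0) -> dict:
    """Per-100 g values of the finished recipe.

    ingredients: list of (raw_grams, {nutrient: per-100 g value}) pairs
    yield_pct:   finished mass as a percentage of raw mass
    """
    raw_total = sum(grams for grams, _ in ingredients)
    if raw_total <= 0:
        raise ValueError("recipe has no mass")
    yield_fraction = max(yield_pct, 1.0) / 100  # clamp to >= 1%
    finished_mass = raw_total * yield_fraction
    totals = {}
    for grams, per_100g in ingredients:
        for nutrient, value in per_100g.items():
            totals[nutrient] = totals.get(nutrient, 0.0) + value * grams / 100
    return {nutrient: total / finished_mass * 100 for nutrient, total in totals.items()}


def per_serving(per_100g: dict, serving_grams: float) -> dict:
    """Scale per-100 g values to a serving size."""
    return {nutrient: value * serving_grams / 100 for nutrient, value in per_100g.items()}
```

Two ingredients of 200 g (10 g protein per 100 g) and 100 g (4 g per 100 g) contain 24 g protein in 300 g raw mass, so 8 g per 100 g at 100% yield and 10 g per 100 g at 80% yield, where the finished mass is 240 g.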

Free sugars

Definition per SACN 2015: added mono- and disaccharides plus those naturally present in honey, syrups, fruit juices, and purées. Intrinsic sugars in whole or cellularly-intact fruit and vegetables are excluded.

Logic:

Classification rules and every unclassified → classified transition are reviewed by a registered nutritionist before being promoted to production.

Claims, NPM and HFSS

The engine evaluates 15 nutrition claim types from the Annex to Regulation (EC) No 1924/2006:

"Source of" and "high in" vitamin or mineral: 15% / 30% of NRV per 100 g, against EU 1169/2011 Annex XIII Part A.

The engine flags qualifying claims; it does not assert the legality of any specific marketing statement, which depends on conditions of use (claim wording, comparator product, target population) outside the threshold check.
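The 15% / 30% vitamin and mineral threshold check can be illustrated as follows. The NRVs shown are a small subset of Annex XIII Part A, the key and function names are hypothetical, and the sketch covers solid foods per 100 g only (beverages and single-portion packs have their own thresholds in the regulation).

```python
# Daily nutrient reference values, subset of EU 1169/2011 Annex XIII Part A.
NRV = {
    "vitamin_c_mg": 80.0,
    "calcium_mg": 800.0,
    "iron_mg": 14.0,
    "vitamin_d_ug": 5.0,
}


def micronutrient_claim(nutrient: str, amount_per_100g: float):
    """Return 'high in', 'source of', or None for a solid food."""
    share_of_nrv = amount_per_100g / NRV[nutrient]
    if share_of_nrv >= 0.30:
        return "high in"
    if share_of_nrv >= 0.15:
        return "source of"
    return None
```

A food with 30 mg vitamin C per 100 g reaches 37.5% of the NRV and qualifies for the "high in" threshold; 13 mg reaches 16.25% and qualifies for "source of" only.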

NPM and HFSS. Both UK 2004/05 and 2018 Department of Health Nutrient Profiling Models are implemented and validated against all 10 GOV.UK published worked examples. Fibre is normalised to AOAC for scoring (NSP × 1.33 → AOAC, the EFSA-recognised conversion). The protein-points exclusion rule (A ≥ 11 and FVN < 5 → protein zeroed) is applied per the original DH guidance.
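The final scoring step, including the protein exclusion, can be sketched as below. The point-allocation tables that turn nutrient amounts into A and C points are not reproduced here, so the functions take points as inputs. Function names are hypothetical; the cut-offs (4 for foods, 1 for drinks) are those of the 2004/05 model.

```python
def nsp_to_aoac(nsp_g: float) -> float:
    """Convert NSP fibre to an AOAC-equivalent value for scoring."""
    return nsp_g * 1.33


def npm_score(a_points: int, protein_points: int,
              fibre_points: int, fvn_points: int) -> int:
    """Overall score: A points minus C points, with the protein exclusion applied."""
    if a_points >= 11 and fvn_points < 5:
        c_points = fibre_points + fvn_points  # protein points are not counted
    else:
        c_points = protein_points + fibre_points + fvn_points
    return a_points - c_points


def is_hfss(score: int, is_drink: bool = False) -> bool:
    """2004/05 model: foods scoring 4 or more, drinks scoring 1 or more."""
    return score >= (1 if is_drink else 4)
```

With 12 A points, 5 protein, 3 fibre and 2 FVN points, protein is excluded and the score is 12 − 5 = 7. Raising FVN to 5 points restores the protein points and the score falls to 12 − 13 = −1.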

Allergens

Nibblr classifies the UK 14 mandatory allergens per FIC Annex II: cereals containing gluten, crustaceans, eggs, fish, peanuts, soybeans, milk, tree nuts, celery, mustard, sesame seeds, sulphur dioxide and sulphites at >10 mg/kg, lupin, molluscs.

Classification combines:

All inferred classifications are tagged with confidence "inferred"; promotion to production allergen status requires nutritionist review. The system never silently asserts allergen-free for ingredients it cannot classify with high confidence: ambiguous cases stay flagged rather than being resolved by a guess.

What this page doesn't claim

How to verify, how to push back

Every ingredient in the API exposes source and external_ref. Use them in academic, regulatory, or label-substantiation work. The composite-ingredient endpoint returns gap-fill provenance per nutrient, so a citation can be made specific to which source supplied which value. A citation built from these fields might read:

Salmon, raw — Nibblr ingredient ID 4123
Primary source: CoFID 2021 (UK FSA, Open Government Licence v3.0)
Source ref: food code 11-678
Vitamin D: 8 µg/100g (gap-filled from Norwegian Matvaretabellen 2026, food code 0445)
Retrieved: 2026-04-15