Product-Level AI-Derived Indicators Database for International Trade
Empirical research in international trade increasingly relies on product-level panel data. Granular bilateral trade flows at the HS 6-digit level are now available for most countries and years. Yet the indicators used to characterize those products have not kept pace. Classical measures like the Rauch (1999) classification were defined for older nomenclatures such as SITC Rev. 2 and have never been updated. Using them with modern HS-based trade data requires concordance tables that introduce noise, lose coverage, and often operate at much coarser levels of aggregation than the original. Other product attributes—perishability, hazardousness, dual-use potential, semiconductor content—have never been systematically classified at the HS 6-digit level at all.
PLAID addresses this gap with a replicable pipeline in which large language models perform the same classification tasks that human experts once performed—but scalably, consistently, and directly at the HS 6-digit level for any revision. Each indicator is classified independently by an ensemble of four frontier LLMs. Aggregating predictions across models yields scalable labels together with natural measures of uncertainty. Because the method operates directly on product descriptions, it eliminates the need for concordance tables and can be applied consistently across all HS revisions since 1992.
The current beta provides full HS 6-digit coverage across all major revisions since 1992 for six indicators.
Six dimensions of product classification
Rauch classification. Classifies goods by price-formation institution as organized-exchange (w), reference-priced (r), or differentiated (n). Organized-exchange goods trade under standardized contracts on major commodity exchanges with public prices. Reference-priced goods have widely published benchmark prices. Prices of differentiated goods depend on brand, design, and specifications.
BEC end-use. Replicates the SNA end-use dimension of the UN BEC framework. Capital goods are used in production over multiple periods. Intermediate inputs are consumed in production processes. Consumption goods are purchased by households for direct use.
Perishability. Measures how quickly goods lose economic value on a five-class scale. Ultra-perishable products (class 1) lose value within days; non-perishable products (class 5) retain value for decades. The measure captures physical spoilage, regulatory expiry, seasonal obsolescence, and technological obsolescence.
Hazardous and dual-use goods. Two separate boolean indicators. Hazardous: the product is classified under the Globally Harmonized System (GHS) or subject to dangerous goods transport regulations. Dual-use: the product has legitimate civilian applications but potential military, weapons, or surveillance uses per the Wassenaar Arrangement.
Semiconductor content. Indicates whether the product is a semiconductor or contains one as a functional component. Covers integrated circuits, finished electronics, vehicles with ECUs, medical devices with microcontrollers, and industrial machinery with PLCs.
3TG minerals. Indicates whether the product contains tin, tantalum, tungsten, or gold, which are regulated under EU Regulation 2017/821 and US Dodd-Frank Section 1502. Also identifies the specific mineral.
PLAID v0.1 beta database — 42 files, CC BY 4.0
PLAID data is available as a static JSON API. Each product has its own endpoint containing consensus classifications and per-model reasoning.
Product detail:
curl https://plaid.julianhinz.com/api/v0.1/H6/010121.json
Full product index (all products for a revision):
curl https://plaid.julianhinz.com/api/v0.1/H6/index.json
Example product detail response:
{
  "code": "010121",
  "revision": "H6",
  "description": "Horses; live, pure-bred breeding animals",
  "indicators": {
    "rauch": {
      "consensus": { "value": "n", "shares": { "w": 0, "r": 0, "n": 1 } },
      "models": {
        "Mistral": { "value": "n", "confidence": 0.92, "reasoning": "..." }
      }
    }
  }
}
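A response like the one above can be consumed with a few lines of Python. The sketch below is an illustration, not an official client: the helper names `product_url`, `fetch_product`, and `consensus_summary` are hypothetical, the URL pattern is inferred from the two curl examples, and the field names are taken from the sample response only. It uses only the standard library, and the demonstration at the end runs offline on the sample record.

```python
import json
from urllib.request import urlopen

# Base URL taken from the curl examples; the path pattern is an inference.
BASE_URL = "https://plaid.julianhinz.com/api/v0.1"


def product_url(code: str, revision: str = "H6") -> str:
    """Build the product-detail URL for one HS6 code in one revision."""
    return f"{BASE_URL}/{revision}/{code}.json"


def fetch_product(code: str, revision: str = "H6") -> dict:
    """Download and decode one product record (requires network access)."""
    with urlopen(product_url(code, revision)) as response:
        return json.load(response)


def consensus_summary(product: dict, indicator: str) -> tuple:
    """Return the consensus value and the share of models that chose it."""
    consensus = product["indicators"][indicator]["consensus"]
    value = consensus["value"]
    return value, consensus["shares"][value]


# Offline demonstration on the sample response shown above.
SAMPLE = json.loads("""
{
  "code": "010121",
  "revision": "H6",
  "description": "Horses; live, pure-bred breeding animals",
  "indicators": {
    "rauch": {
      "consensus": {"value": "n", "shares": {"w": 0, "r": 0, "n": 1}},
      "models": {
        "Mistral": {"value": "n", "confidence": 0.92, "reasoning": "..."}
      }
    }
  }
}
""")

value, share = consensus_summary(SAMPLE, "rauch")
print(product_url("010121"))
print(value, share)
```

To work with live data, replace `SAMPLE` with `fetch_product("010121")`. Other indicators may use different keys or value types than `rauch`, so check the actual response before relying on them.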
Each HS6 product is independently classified by four large language models via the OpenRouter API. The prompt includes the product's HS6 code, full description, and chapter-level context from the official HS nomenclature. The structured prompt ensures each model receives identical information, producing classifications that are comparable across models.
Multi-model ensemble. Using four models rather than one reduces model-specific bias and allows uncertainty to be quantified. The final classification is the majority vote across models. Per-category shares and standard deviations measure how strongly the models agree, and products on which the models disagree can be flagged for manual review or further investigation.
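The aggregation step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the PLAID implementation: the function name `aggregate_votes`, the `needs_review` flag, and the rule that a tie produces no consensus label are hypothetical, since the text does not say how ties among four models are resolved. The standard deviations mentioned above are omitted because their exact definition is not given.

```python
from collections import Counter

# Rauch categories as listed in the indicator description.
RAUCH_CATEGORIES = ("w", "r", "n")


def aggregate_votes(votes, categories=RAUCH_CATEGORIES):
    """Turn per-model labels into shares, a majority label, and a review flag."""
    if not votes:
        raise ValueError("need at least one model vote")
    unknown = set(votes) - set(categories)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")

    counts = Counter(votes)
    # Share of models choosing each category, including categories with no votes.
    shares = {c: counts[c] / len(votes) for c in categories}

    top = max(counts.values())
    winners = [c for c in categories if counts[c] == top]
    # Assumption: a tie (for example 2 vs 2) yields no consensus label.
    value = winners[0] if len(winners) == 1 else None

    # Assumption: any disagreement at all flags the product for review.
    needs_review = len(set(votes)) > 1
    return {"value": value, "shares": shares, "needs_review": needs_review}


unanimous = aggregate_votes(["n", "n", "n", "n"])
split = aggregate_votes(["n", "n", "r", "n"])
tied = aggregate_votes(["n", "n", "r", "r"])
print(unanimous)
print(split)
print(tied)
```

Flagging every disagreement is the strictest possible rule. A looser variant could flag only products whose top share falls below a chosen threshold.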
Product descriptions are sourced from the UN STATS HS nomenclature.
Help us improve PLAID
PLAID is a beta release and we welcome feedback from the research community. If you notice a misclassification, have suggestions for new indicators, or want to report an issue, please get in touch.
Send feedback to tradepolicy@kielinstitut.de
You can also suggest corrections for individual products using the "Suggest a correction" button on each product detail page.