We Built an AI-Powered NAICS Code Lookup Tool in an Afternoon — Here's Why That Matters
A utilities client had stale business classification data driving wrong power restoration priorities. We built a tool that queries six public data sources in parallel, falls back to a web scrape when they come up empty, and uses AI to return a confidence-rated NAICS code. Here's how.
Every large organization has a class of problem that nobody talks about publicly: operational data quality. The data that drives decisions is stale and inconsistently maintained, and nobody has the bandwidth to fix it systematically. It's not glamorous. It's also genuinely expensive.
One of our consulting clients recently surfaced exactly this kind of problem. They're a utilities company responsible for power restoration after outages. The sequence in which power gets restored isn't random — it follows a priority system tied to business type. Hospitals before restaurants. Fire stations before retail. The business type is classified by a NAICS code. And the NAICS code associated with each service address in their system is frequently wrong.
The scenario they described: a building that previously housed an urgent care clinic has a new tenant, a restaurant. The urgent care moved out two years ago. The NAICS code in the system still says urgent care. So the restaurant gets priority restoration it isn't entitled to, while actual critical facilities wait longer than they should. Multiply that by thousands of accounts in three states and you have a systemic data quality problem with real consequences for real people.
The ask was simple: can we build something that helps their customer service reps look up and validate the correct NAICS code for any business, by name or EIN, quickly and without a lot of manual research?
We built it in an afternoon.
The architecture
The tool runs on Cloudflare Pages with Cloudflare Functions handling all server-side logic. When a customer service rep (CSR) enters a company name, the tool draws on seven public data sources. The six structured ones are queried in parallel, not sequentially, and the seventh is a fallback:
- SEC EDGAR: public company annual filings
- SAM.gov: federal contractor registrations (which always include self-reported NAICS codes)
- NPPES: the CMS registry of healthcare providers that hold a National Provider Identifier (NPI)
- USASpending.gov: federal award contracts, each with a self-reported NAICS code
- OpenStreetMap: identifies fire stations, police departments, airports, and TV stations by facility type
- CMS Care Compare: nursing homes and dialysis centers in all three states
- Web scrape fallback: if all six structured sources return nothing, the tool locates and scrapes the company's website
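The fan-out described above can be sketched in TypeScript roughly as follows. This is a minimal illustration, not the deployed code: the SourceHit and Source types, the gatherEvidence and withTimeout helpers, and the 5-second timeout are all hypothetical names and values chosen for the example. It assumes each source is wrapped in an async lookup function and that a source that fails or times out should simply be skipped.

```typescript
// Hypothetical shapes for illustration; none of these names come from the original tool.
type SourceHit = { source: string; naics?: string; detail: string };
type Source = { name: string; lookup: (company: string) => Promise<SourceHit[]> };

// Reject if a lookup takes longer than `ms`, and always clear the timer afterwards.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

// Query every structured source at once; use the fallback only if none of them returns a hit.
async function gatherEvidence(
  company: string,
  structured: Source[],
  fallback: Source,
  timeoutMs = 5000, // assumed value, not from the original post
): Promise<SourceHit[]> {
  // allSettled waits for every lookup and never rejects, so one failing source cannot sink the rest.
  const settled = await Promise.allSettled(
    structured.map((s) => withTimeout(s.lookup(company), timeoutMs)),
  );
  const hits = settled.flatMap((r) => (r.status === "fulfilled" ? r.value : []));
  if (hits.length > 0) return hits;

  // All structured sources came back empty or failed: try the web scrape.
  try {
    return await withTimeout(fallback.lookup(company), timeoutMs);
  } catch {
    return [];
  }
}
```

Promise.allSettled is used here rather than Promise.all because public APIs go down independently, and a single rejected lookup would otherwise discard the results from the sources that did respond.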
Everything goes to Claude, which synthesizes the results, cross-references the sources, and returns a NAICS code with a confidence level and a source citation. When multiple sources agree on the business type, confidence is flagged as High. When Claude is working from web content alone, confidence is Medium. When it is guessing from the company name alone, confidence is Low and the CSR knows to verify.
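In the tool, Claude assigns the confidence level. The rule itself is simple enough to write down as a plain function, which can be useful for sanity-checking the model's answer. The sketch below is an illustration only: the Evidence and Rating types and the rateConfidence function are hypothetical, "agreement" is assumed to mean two or more distinct sources reporting the same NAICS code, and a single structured source is treated as Medium, a case the description above does not cover.

```typescript
// Hypothetical types for illustration; not taken from the original tool.
type Confidence = "High" | "Medium" | "Low";
type Evidence = { source: string; naics: string };
type Rating = { naics: string | null; confidence: Confidence };

function rateConfidence(evidence: Evidence[]): Rating {
  // Group by code and count distinct sources, so one source repeating itself is not agreement.
  const sourcesByCode = new Map<string, Set<string>>();
  for (const e of evidence) {
    const sources = sourcesByCode.get(e.naics) ?? new Set<string>();
    sources.add(e.source);
    sourcesByCode.set(e.naics, sources);
  }

  // Rank codes by how many distinct sources back them.
  const ranked = Array.from(sourcesByCode, ([naics, sources]) => ({ naics, count: sources.size }))
    .sort((a, b) => b.count - a.count);
  const best = ranked[0];

  // No evidence at all: any code would be a guess from the company name.
  if (best === undefined) return { naics: null, confidence: "Low" };

  // Two or more agreeing sources: High. A lone source (for example the web scrape): Medium.
  return { naics: best.naics, confidence: best.count >= 2 ? "High" : "Medium" };
}

// Usage: two independent registries reporting the same code rate as High.
const rating = rateConfidence([
  { source: "SAM.gov", naics: "722511" },
  { source: "USASpending.gov", naics: "722511" },
]);
console.log(rating.confidence);
```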
The entire stack costs effectively nothing to run. Cloudflare Functions have a generous free tier. Every data source is public and free. Claude API calls at this volume are pennies.
Why this is interesting beyond the specific use case
NAICS codes are one example of a broader category: authoritative classification data that organizations need but don't maintain well. The same pattern applies to business license verification, contractor classification, healthcare provider credentialing, tax classification, and dozens of other compliance and operational data problems.
What we built here isn't really a NAICS lookup tool. It's a pattern for building low-cost, high-accuracy classification tools by triangulating across multiple public data sources and using AI to synthesize the result. The confidence scoring approach — where agreement across sources drives High confidence — is reusable for any problem where multiple weak signals combine into a strong one.
This is what consulting at the intersection of AI and operations actually looks like. Not a slideware strategy. A working tool, deployed, solving a real problem, in an afternoon.
2057 Holdings is an Edmond, Oklahoma-based holding company operating six portfolio companies across technology, education, and professional services. Our consulting practice builds custom AI infrastructure for operators who need real solutions, not vendor pitches.
Read the technical breakdown on Jesse Myers' blog: Building an AI NAICS Lookup Tool: Seven Sources, One Answer
Safire Business Services wrote about what this means for enterprise data quality strategy: The NAICS Problem Is a Symptom — Your Business Data Has the Same Disease
Noevant covers how this type of tool fits into an AI-powered operational stack: One Afternoon, Seven APIs, One Answer: The Composable AI Tool Pattern