Mapping the Invisible

How COMETS, STARS and Nanobank Illuminate Innovation's Hidden Pathways

The Data Desert in Innovation Science

Imagine trying to predict weather patterns without satellite imagery, or tracking disease outbreaks without global reporting systems. For decades, this was the reality for researchers studying how scientific discoveries become the innovations that shape our economy and society. The shortage of comprehensive, linked data has been the single greatest impediment to advancing the science of science and innovation policy and practice (SciSIPP). As one landmark paper bluntly stated: "Data availability is arguably the greatest impediment" 1 3 .

Data Challenge

Before these systems, innovation research relied on fragmented, disconnected datasets that made comprehensive analysis nearly impossible.

Breakthrough

The integration of patents, publications, grants and firm data created the first complete picture of innovation pathways.

The Innovation Tracking Trinity: Nanobank, COMETS, and STARS Explained

Nanobank
The Proof-of-Concept Pioneer

Born from NSF grants in the early 2000s, Nanobank pioneered the approach of integrating disparate innovation records. Focused exclusively on nano-scale sciences, it became the prototype for what would follow.

COMETS
The Public Observatory

Short for Connecting Outcome Measures in Entrepreneurship Technology and Science, COMETS expanded Nanobank's approach to cover all sciences, technologies, and high-tech industries.

STARS
The Secure Vault

The Science and Technology Agents of Revolution System (STARS) serves as the restricted-access parent database to COMETS, containing sensitive data at the individual scientist-inventor-entrepreneur level.

Comparison of Innovation Database Ecosystem

| Feature | Nanobank | COMETS | STARS |
| --- | --- | --- | --- |
| Scope | Nano-scale sciences only | All sciences/technologies | All sciences/technologies |
| Access | Public | Public | Restricted (NBER/UCLA) |
| Data Levels | Organization-level | Organization-level | Individual + Organization |
| Key Sources | Nano patents, publications, grants | Patents, grants, publications, dissertations | All COMETS data + proprietary sources |
| Geographic Coding | Country/State | Country/State/County/City | Country/State/County/City |

The Disambiguation Revolution: How Innovation Gets Its ID Card

The revolutionary power of these databases lies not just in the data they contain, but in how they connect it. Consider this challenge: When "J. Smith" appears on a Stanford patent, an MIT paper, and an NIH grant, are these the same person? Traditional databases would treat them as separate entities. The COMETS/STARS team tackled this through a massive disambiguation effort that assigns unique identifiers to organizations and individuals 1 6 .

The Identity Resolution Process:

Entity Extraction

Harvest names from patents, grants, publications, and firm records

Attribute Clustering

Group records by name variants, institutional affiliations, and specialty keywords

Geographic Anchoring

Pin each record to specific locations using address data

Cross-Verification

Check against authoritative directories (university faculty, corporate registries)

Unique ID Assignment

Create persistent identifiers that track entities across systems and time
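
The clustering and ID-assignment steps above can be sketched in a few lines of Python. This is a minimal illustration, not the COMETS/STARS team's actual method: the Record fields, the name_key blocking rule, and the same_entity matching rule (same affiliation, or at least two shared keywords) are all assumptions made for the example. Geographic anchoring and directory cross-verification are left out, and names are assumed to be written in "Given Surname" order.

```python
from collections import defaultdict
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Record:
    source: str          # "patent", "paper", or "grant"
    name: str            # name as printed, assumed "Given Surname" order
    affiliation: str     # institution string
    keywords: frozenset  # specialty keywords


def name_key(name):
    """Reduce a name to 'surname, first initial' so variants land in one block."""
    parts = name.replace(".", " ").replace(",", " ").split()
    return f"{parts[-1].lower()}, {parts[0][0].lower()}"


def same_entity(a, b):
    """Toy matching rule (an assumption): same affiliation or two shared keywords."""
    same_place = a.affiliation.lower() == b.affiliation.lower()
    return same_place or len(a.keywords & b.keywords) >= 2


def assign_ids(records):
    """Return {record index: entity ID}, merging matches with union-find."""
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    # Only compare records whose names fall in the same block.
    blocks = defaultdict(list)
    for i, rec in enumerate(records):
        blocks[name_key(rec.name)].append(i)

    for members in blocks.values():
        for i, j in combinations(members, 2):
            if same_entity(records[i], records[j]):
                parent[find(j)] = find(i)

    # Give each connected group one persistent identifier.
    roots, ids = {}, {}
    for i in range(len(records)):
        root = find(i)
        ids[i] = roots.setdefault(root, f"E{len(roots) + 1:04d}")
    return ids


records = [
    Record("patent", "J. Smith", "Stanford University", frozenset({"nanotubes", "catalysis"})),
    Record("paper", "John Smith", "MIT", frozenset({"nanotubes", "catalysis", "synthesis"})),
    Record("grant", "Jane Smith", "NIH", frozenset({"genomics"})),
    Record("paper", "J. Smith", "Stanford University", frozenset({"polymers"})),
]
ids = assign_ids(records)
```

In this toy data the two Stanford records and the MIT paper end up sharing one identifier, while the NIH grant gets its own. A production system would need far more conservative rules, since a shared first initial and two shared keywords are weak evidence on their own.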

Geographic Innovation Analysis Capabilities
| Geographic Level | Analysis Capabilities | Sample Research Applications |
| --- | --- | --- |
| Country | Cross-national comparisons | R&D investment effectiveness |
| Region | Multi-state economic areas | Tech cluster development |
| State | Policy impact studies | Tax policy effects on star scientist migration |
| County | Regional innovation systems | University knowledge spillovers |
| City | Urban innovation districts | Startup incubator effectiveness |
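
To make the idea of multi-level geographic coding concrete, here is a small sketch of rolling the same records up to different levels. The field names (country, state, county, city), the rollup helper, and the sample rows are hypothetical and do not reflect the actual COMETS schema. A region level would additionally need a state-to-region lookup, which is omitted here.

```python
from collections import Counter

# Levels ordered from coarsest to finest (an assumed layout, not the real schema).
LEVELS = ("country", "state", "county", "city")


def rollup(records, level):
    """Count records per geographic unit at the requested level."""
    depth = LEVELS.index(level) + 1
    # Key each record by its full path down to the chosen level, so that
    # same-named counties or cities in different states stay separate.
    return Counter(tuple(rec[name] for name in LEVELS[:depth]) for rec in records)


# Made-up sample rows for illustration only.
patents = [
    {"country": "US", "state": "CA", "county": "Santa Clara", "city": "Palo Alto"},
    {"country": "US", "state": "CA", "county": "Santa Clara", "city": "San Jose"},
    {"country": "US", "state": "CA", "county": "Los Angeles", "city": "Los Angeles"},
    {"country": "US", "state": "MA", "county": "Middlesex", "city": "Cambridge"},
]

by_state = rollup(patents, "state")
by_county = rollup(patents, "county")
```

Because every record carries the full hierarchy, the same data can feed a state-level policy study and a county-level spillover study without re-coding.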

Real-World Revelations: From Tax Policy to Tech Clusters

The true test of any scientific tool lies in what discoveries it enables. COMETS and its sibling databases have powered research that overturned conventional wisdom about innovation economics:

The Star Scientist Migration Project

Using STARS data that tracks individual scientists, researchers analyzed how state taxes affect the location decisions of top innovators 2 4 .

The Nanotech Explosion Map

Using Nanobank's specialized tracking, researchers identified the critical role of academic pioneers in nanotechnology commercialization 1 .

The Patent Quality Paradox

By linking patent data with university records through COMETS, researchers discovered that academic patents with industry collaboration were cited significantly more often than those without it 2 .

Key Research Findings Enabled by COMETS/STARS/Nanobank

| Research Focus | Database Used | Key Finding | Policy Impact |
| --- | --- | --- | --- |
| Star scientist migration | STARS | Top scientists highly responsive to tax incentives | States redesigned innovation tax credits |
| Nanotech firm formation | Nanobank | Academic stars drive regional firm entry | Targeted university research investment |
| Patent quality | COMETS | Academic-industry collaboration produces highest impact patents | Enhanced support for partnership programs |
| Innovation clusters | STARS/COMETS | Critical mass occurs at ~50 star scientists | Cluster development strategies refined |

The Scientist's Toolkit: Inside the Innovation Database Engine

What makes these databases so powerful? A suite of specialized tools and approaches that transform raw data into innovation intelligence:

Disambiguation Algorithms

Sophisticated name-matching systems resolving "J. Smith" across patents, papers, and grants using institutional affiliations, co-author networks, and specialty keywords 5 6 .

Zucker-Darby Classification System

Proprietary taxonomy categorizing patents and grants into precise science/technology areas, enabling field-specific analysis 5 .

Ticker/CUSIP Linkage

Financial identifier integration allowing researchers to connect innovation activities directly with corporate performance data 1 3 .

Longitudinal Tracking

Specialized data structures maintaining entity identities across decades despite name changes, mergers, or address changes 4 .
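
Two of these tools, longitudinal tracking and Ticker/CUSIP linkage, can be illustrated together with a small sketch. Everything here is hypothetical: the firm names, the ORG identifiers, the placeholder CUSIP string, and the ALIASES and ORG_TO_CUSIP tables are invented for the example and do not come from the real databases. The idea shown is only that historical name variants resolve to one persistent ID, which then joins to a financial identifier.

```python
# Hypothetical alias table: every historical name maps to one persistent ID.
ALIASES = {
    "acme nanotech inc": "ORG001",
    "acme materials corp": "ORG001",  # same firm after a rename
    "beta labs llc": "ORG002",
}

# Hypothetical crosswalk from persistent ID to a financial identifier.
# "EXAMPLE01" is a placeholder, not a real CUSIP. ORG002 is treated as
# privately held, so it has no entry.
ORG_TO_CUSIP = {"ORG001": "EXAMPLE01"}


def normalize(name):
    """Lowercase, drop commas and periods, and collapse whitespace."""
    cleaned = name.lower().replace(",", " ").replace(".", " ")
    return " ".join(cleaned.split())


def patents_by_cusip(patents):
    """Count patents per CUSIP, skipping assignees that cannot be linked."""
    counts = {}
    for patent in patents:
        org_id = ALIASES.get(normalize(patent["assignee"]))
        cusip = ORG_TO_CUSIP.get(org_id)
        if cusip is None:
            continue  # unknown name or no financial identifier
        counts[cusip] = counts.get(cusip, 0) + 1
    return counts


# Made-up patent records spanning a corporate rename.
patents = [
    {"assignee": "Acme Nanotech, Inc.", "year": 1999},
    {"assignee": "ACME Materials Corp.", "year": 2006},
    {"assignee": "Beta Labs LLC", "year": 2004},
]

linked = patents_by_cusip(patents)
```

Both Acme records count toward the same CUSIP despite the rename, while the unlisted firm is dropped from the financial join but would still be available for analyses that use the persistent ID alone.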

Beyond the Data: Shaping Tomorrow's Innovation Landscape

As COMETS evolves, its creators envision a future where innovation tracking becomes as precise as weather forecasting. Current developments include:

Real-time updating

Reducing data lag from years to months

Expanded global coverage

Moving beyond US-centric data

Machine learning enhancements

Predicting emerging tech hotspots

Policy simulation modules

Modeling innovation policy impacts before implementation 6

"COMETS transforms innovation research from disconnected case studies into a replicable science."

Innovation researcher 3 5

References