As a technology manager you confront a spectrum of issues: prioritizing your R&D project portfolio, benchmarking your research productivity, checking whether a key competitor is pursuing a particular development, or finding the right partner to commercialize a new product. Alan L. Porter, Search Technology, Inc. and Georgia Tech, reports that there has been notable progress over the past decade in what empirical analyses can contribute to such decisions.
“Tech Mining” is the name for such analyses of Science, Technology & Innovation information resources to provide managers with usable technical intelligence (1, 2). The intent is to help technology managers adeptly address “who, what, when, and where?” questions. In this short piece, Porter illustrates the sorts of answers that Tech Mining can provide, focusing on “who and what” questions concerning one timely emerging technology.
Here’s his account:
In June 2009, nine of us met at the North Carolina Biotechnology Center to initiate an R&D data mining project for the N.C. Center of Innovation for Nanobiotechnology (“Nanobiotech-COI”). The aim was for CIMS to provide Center director Brooks Adams with an enduring resource that would help him connect researchers and companies pursuing innovation in the application of nanotechnology to drug delivery.
We intended to accomplish this by gathering information from research publications to provide an overall “research landscape.” More important, results would be searchable so that COI staff could identify highly active researchers pursuing topics that address specific interests.
The first step is to identify the relevant articles; the second is to profile both sides: 1) the active academic research groups and 2) the companies pursuing research in the area.
Searching Research Papers
We searched for pertinent research publications within a set of over 500,000 nanorelated publication abstracts from the Web of Science (WOS). Kam Leong of Duke University, with Brooks Adams, Douglas K.R. Robinson (Institute of Nanotechnology, Glasgow) and Haico Tekulve (University of Twente), helped us refine a search algorithm to capture papers relating to “drug delivery.”
We assessed a variety of specific search terms, augmented by papers that cited work considered directly related to drug delivery. Our final search consisted of 21 phrases that captured abstracts with certain key phrases in proximity.
The two most prominent phrases give the flavor (details available on request). Bear in mind that these searches run within a nanorelated set, which is itself based on a multi-part inquiry:
• Deliver* [within 2 words of] (gene or DNA or drug* or pharmac*).
• Control* [followed by, within 2 words] releas*.
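For readers who want to try something similar on their own abstracts, the two phrases above can be approximated with regular expressions. This is only a rough sketch, not the search algorithm actually run against WOS: it assumes that “within 2 words” means at most two intervening words, and the pattern and function names are illustrative.

```python
import re

# Assumption: "within 2 words" = at most two intervening words.
GAP = r"(?:\W+\w+){0,2}\W+"
TARGET = r"(?:gene|dna|drug\w*|pharmac\w*)"

# Deliver* [within 2 words of] (gene or DNA or drug* or pharmac*), either order.
DELIVER_NEAR = re.compile(
    rf"\bdeliver\w*{GAP}{TARGET}\b|\b{TARGET}{GAP}deliver\w*",
    re.IGNORECASE,
)

# Control* [followed by, within 2 words] releas* (order matters here).
CONTROL_RELEASE = re.compile(rf"\bcontrol\w*{GAP}releas\w*", re.IGNORECASE)


def matches_drug_delivery(abstract):
    """Return True if the abstract hits either proximity phrase."""
    return bool(
        DELIVER_NEAR.search(abstract) or CONTROL_RELEASE.search(abstract)
    )
```

The first pattern accepts either word order, while the second requires the “control” term to come first, mirroring the “followed by” wording of the second phrase.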
We adapted this algorithm to search WOS for additional 2008 and 2009 publications. After consolidating and cleaning the data, we were left with 33,995 records covering the period 1998-2009. (Data for 2009 are not complete; the search, dated Nov. 24, might be estimated at ~70% of the eventual full year’s records.) Using VantagePoint software (www.theVantagePoint.com) we cleaned various fields to consolidate name variation and help generate analyses and visualizations. For example, we created a “key terms” field from two keyword fields and title phrases.
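The “key terms” field can be pictured as a merge of several term lists with duplicates and spelling variants collapsed. The sketch below shows the idea in plain Python; it is not VantagePoint's procedure, the field names (author_keywords, index_keywords, title_phrases) are hypothetical, and normalization is limited to lowercasing and whitespace cleanup.

```python
# Hypothetical field names; a real record layout will differ.
SOURCE_FIELDS = ("author_keywords", "index_keywords", "title_phrases")


def build_key_terms(record):
    """Merge two keyword fields and title phrases into one de-duplicated list."""
    seen = set()
    merged = []
    for field in SOURCE_FIELDS:
        for term in record.get(field, []):
            # Collapse case and spacing so trivial variants count once.
            normalized = " ".join(term.lower().split())
            if normalized and normalized not in seen:
                seen.add(normalized)
                merged.append(normalized)
    return merged
```

Real name consolidation (for example, matching organization name variants) needs fuzzier matching than this; the sketch covers only the simplest case.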
What We Learned
We first addressed “who?” questions. Which countries stand out in this research (we could also call that a “where” question)? Which research organizations are most active in this field? Next we addressed combined where/who/what: which organizations are prominent in North Carolina — and on which topics are they focusing?
We found that the leading countries (in which any of an article’s authors reside) are the United States (33%) and China (14%). The latter is somewhat surprising, but much in line with China’s increasingly strong basic research presence in most fields these days. The leading organizations publishing in international journals are the Chinese Academy of Sciences (796 of the 33,995 papers) and the National University of Singapore (425). The leading U.S. organizations are Washington University (415), Harvard (379) and MIT (349).
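Because a paper is credited to every country in which any of its authors resides, the country shares need not sum to 100%. A minimal sketch of that counting rule follows; the input layout (one list of author countries per paper) is an assumption made for illustration.

```python
from collections import Counter


def country_shares(papers):
    """papers: iterable of lists of author countries, one list per paper.

    Each country is counted once per paper, however many of the
    paper's authors reside there.
    """
    counts = Counter()
    total = 0
    for countries in papers:
        total += 1
        counts.update(set(countries))
    if total == 0:
        return {}
    return {country: n / total for country, n in counts.items()}
```

An internationally co-authored paper adds to the share of each participating country, which is why the percentages overlap.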
Of special interest to Nanobiotech-COI are active researchers in North Carolina. The table above profiles the three standout universities. Here we’ve chosen to break out information on existing corporate research collaborations, leading nano-enhanced drug delivery researchers, keywords (indicative of topical emphases), and how active the organization has been recently.
Who’s Doing What
On the flip side, the Nanobiotech-COI wants to know which companies are engaging in nano-enhanced drug delivery. In extending this technical intelligence, we would mine patents as a key information resource on company development pursuits, but in this exercise we focus on WOS publications (recognizing that company publication policies vary greatly). We probe company activity at different levels. For one, we are interested in “anywhere” corporate activity. Globally, here are the “Top Ten” research publishers in WOS (eleven are listed because four organizations tie at 21 publications):
• Novartis (48 publications)
• Pfizer (44)
• Merck (42)
• Japan Science & Technology Corp. (41)
• GlaxoSmithKline (“GSK”) (31)
• Johnson & Johnson (28)
• IBM (27)
• Chemicell GmbH (21)
• Dow Chemical (21)
• Abbott Labs (21)
• GKSS Forschungszentrum Geesthacht GmbH (21)
While “big pharma” dominates the list, note the interesting others — IBM, for instance. Might these be potential partners?
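The list runs to eleven names because four organizations share the count of 21. One simple way to handle such ties is to keep every entry whose count equals that of the entry in the cutoff position. The sketch below applies that rule to the counts listed above; the function name is illustrative, and this is not necessarily how the original ranking was produced.

```python
def top_n_with_ties(counts, n):
    """Return (name, count) pairs for the top n, plus any tied at the cutoff."""
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    if len(ranked) <= n:
        return ranked
    cutoff = ranked[n - 1][1]
    return [(name, count) for name, count in ranked if count >= cutoff]


# Publication counts as listed in the article.
publishers = {
    "Novartis": 48, "Pfizer": 44, "Merck": 42,
    "Japan Science & Technology Corp.": 41, "GlaxoSmithKline": 31,
    "Johnson & Johnson": 28, "IBM": 27, "Chemicell GmbH": 21,
    "Dow Chemical": 21, "Abbott Labs": 21,
    "GKSS Forschungszentrum Geesthacht GmbH": 21,
}
top = top_n_with_ties(publishers, 10)
```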
We further offered general breakout data in MS Excel for each company (e.g., key topics emphasized, notable collaborations, extent of recent activity). The data can respond further to targeted inquiries in VantagePoint (e.g., which researchers are working on topic X in company Y?).
We next zoom in on companies located in North Carolina. These may offer initial contacts for Brooks Adams and colleagues to engage on topics of mutual interest. For instance, we broke out a table showing 22 NC-based research groups. For each we listed collaborating organizations (e.g., GSK-North Carolina works with UNC), key terms, researchers, and so on.
Network maps help Nanobiotech-COI recognize existing relationships to build upon, or sidestep, as the case warrants. Figure 1 identifies collaborations among leading NC organizations. Corresponding global or U.S. network maps can be readily generated on demand. Sometimes it’s handy to map all the collaborations of one organization (e.g., GSK, to help prepare for a meeting with them).
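Behind a network map of this kind sits a simple co-authorship count: two organizations are linked each time they appear on the same paper. The following sketch shows one way to tally those links and to pull out the partners of a single organization. It assumes each paper is reduced to a list of organization names, and it is not a description of how VantagePoint builds its maps.

```python
from collections import Counter
from itertools import combinations


def collaboration_edges(papers):
    """Count co-authored papers for every pair of organizations."""
    edges = Counter()
    for orgs in papers:
        # Sort so each pair has one canonical (a, b) ordering.
        for a, b in combinations(sorted(set(orgs)), 2):
            edges[(a, b)] += 1
    return edges


def partners_of(edges, org):
    """Return the collaborators of one organization with joint-paper counts."""
    partners = Counter()
    for (a, b), weight in edges.items():
        if a == org:
            partners[b] += weight
        elif b == org:
            partners[a] += weight
    return partners
```

The edge counts can then be handed to any graph-drawing tool, with link thickness reflecting the number of joint papers.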
Addressing the “What” Questions
To illustrate how we can address “what” questions (topical foci), suppose the Nanobiotech Center of Innovation is to meet with a Duke University research group to facilitate tech transfer. COI staff first learn about the Duke research emphasis by profiling the group’s research outputs (by analyses such as those illustrated here) and by talking with one or more of the researchers.
Let’s assume the group is researching “chitosan-DNA nanoparticles.” We then scan our global dataset and find 20 companies with one or more publications with a chitosan-related key term. We introduce this list, with suitable breakout information, as Nanobiotech-COI meets with the Duke group to explore open innovation opportunities.
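The chitosan scan amounts to filtering records on a key-term stem and then listing the companies attached to the matching records. Here is a minimal sketch, assuming each record carries hypothetical key_terms and companies fields; simple substring matching stands in for whatever term matching the actual analysis used.

```python
def companies_matching_term(records, stem):
    """Map each company to its number of records with a key term containing stem."""
    stem = stem.lower()
    hits = {}
    for record in records:
        if any(stem in term.lower() for term in record["key_terms"]):
            # set() so a company listed twice on one record counts once.
            for company in set(record["companies"]):
                hits[company] = hits.get(company, 0) + 1
    return hits
```

Any company with a count of one or more belongs on the list brought to the meeting, and the counts themselves give a first indication of how deeply each company is involved.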
Sometimes you want to determine the impact of a line of research. Text mining can probe the citation patterns of articles and/or patents and spot particularly influential R&D outputs, as Figure 2 illustrates.
Focusing on highly-cited publications can help anticipate likely “next steps” offering special opportunity. For example, sorting through our 34,000 nano-enhanced drug delivery records, we find that two have received 1,000 or more citations: “Interpreting patterns of gene expression with self-organizing maps,” and “Semiconductor nanocrystals as fluorescent biological labels.”
We could probe these further to profile which organizations are picking up on each line of inquiry (those citing these articles) and whether these findings are spawning new research areas, providing new techniques of potential utility in one’s own R&D, and so on. The management challenge is to discern how our organization’s interests might be advanced by active engagement with such developments.
Start with the Questions
I have illustrated only a few of the possible findings about nano-enhanced drug delivery here. The Tech Mining approach emphasizes starting with the managerial questions needing answers, not with the data. Then go after that technical intelligence via empirical means. As an R&D manager, what might you want to know about research activity (and patent and business activity as well)? You might want to benchmark your activities against those of a key competitor. Or you might profile the efforts of a particular university to determine whether you see special value in exploring partnership opportunities, of what sort, and with whom.
Profiling a research field in the ways illustrated here can help orient and leverage one’s own R&D efforts more fully. Tech Mining is an essential tool to enable open innovation, i.e., to extract full value from one’s own R&D by licensing or otherwise engaging others and, conversely, by exploiting others’ research to advance one’s own product development.
References and Notes
1. Porter, A.L., and Cunningham, S.W. 2005. Tech Mining: Exploiting New Technologies for Competitive Advantage, Wiley, New York.
2. Porter, A.L., Youtie, J., Shapira, P., and Schoeneck, D.J. 2008. Refining Search Terms for Nanotechnology, Journal of Nanoparticle Research vol. 10, no. 5, pp. 715-728.
3. Porter, A.L. 2008. Tech Mining Facilitates Open Innovation, CIMS Technology Management Report, Spring, p. 3.
4. Web of Science (“WOS”) is the world’s leading compilation of fundamental research article abstracts, covering some 10,000 premier journals; see: www.isiknowledge.com.
Alan L. Porter
Search Technology, Inc.
Georgia Tech University