Decision-making driven by data rather than emotion is the goal of every organization. Often, however, the data presented to senior management are dated and unintentionally biased by the nature of the organization’s reward system. Research and development organizations constantly “push” their inventions to a commercial organization, while commercial groups search for products that meet real or perceived customer needs. Bringing both groups together has been a perennial challenge.
To address the gap between timely data analysis and business decision-making, CIMS is leading a multifaceted program with IBM’s Emerging Software Technologies team (jStart) and NC State’s Office of Technology Transfer (OTT) and its powerful Virtual Computing Lab team.
Our goal is to prove that this gap can be filled by finding and literally “reading” massive amounts of unstructured data that can only be found on company websites, blogs and social media sites located on the World Wide Web. The application we chose for this proof is common to every technology organization, academic or industrial: finding viable partners to help commercialize its inventions.
CIMS has been collaborating with Alan Porter, Professor Emeritus (Georgia Tech) for over 10 years to integrate his “Tech Mining” approach into CIMS research projects.
Over that time we have developed a number of applications to help companies and governments locate a particular technology’s “center of gravity,” i.e., where (which research centers) and who (which scientists) the top experts in the field are. (See Alan’s latest report immediately above, as well as “Tech Mining Facilitates Open Innovation” and “New Tools for Open, Collaborative Innovation,” CIMS TMR Spring and Fall 2008, respectively.)
Tech Mining exploits field-structured data resources (e.g., Derwent World Patent Index, Web of Science, Factiva) to inform management. Searches are performed on these databases; electronic documents (usually abstract records) are retrieved; the data are cleaned and analyzed; and the findings are represented in both tabular and graphical formats that CIMS calls Science Maps.
The project conducted with IBM and NC State’s OTT would now search in the opposite direction, i.e., which companies, and specifically which people inside these organizations, are likely in need of a particular technological capability?
This search would be harder because the source of this information is not found in structured databases but in SEC filings, conference proceedings, joint venture announcements, etc. A fairly complex search argument would need to be constructed, and the search would also require new software algorithms that can literally “read” the web and the power of a supercomputer delivered via the cloud.
The NC State OTT and CIMS have provided the jStart team with a list of primary keywords related to two specific NC State-owned technologies — both in the area of pharmaceutical products. The keyword hits were further refined based upon other secondary keywords or phrases that were contained within either the same sentence or paragraph to find strong or weak hits. Data sources accessible via the web were crawled and a “meta dataset” of over 1 million records created.
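The refinement step described above can be illustrated with a minimal sketch. It is not the jStart team's actual algorithm, which is not detailed here. It assumes that a "strong" hit means a primary and a secondary keyword share a sentence, and a "weak" hit means they share only a paragraph. The function names and the sentence splitter are hypothetical.

```python
import re

def split_sentences(paragraph):
    # Naive splitter (an assumption): break after ., ! or ? followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]

def contains(text, phrase):
    # Case-insensitive match of the whole phrase on word boundaries.
    pattern = r"\b" + re.escape(phrase) + r"\b"
    return re.search(pattern, text, re.IGNORECASE) is not None

def classify_paragraph(paragraph, primary, secondary):
    """Return 'strong', 'weak', or None for one paragraph of text."""
    # No primary keyword anywhere: not a hit at all.
    if not any(contains(paragraph, p) for p in primary):
        return None
    # Strong (assumed meaning): primary and secondary in the same sentence.
    for sentence in split_sentences(paragraph):
        has_primary = any(contains(sentence, p) for p in primary)
        has_secondary = any(contains(sentence, s) for s in secondary)
        if has_primary and has_secondary:
            return "strong"
    # Weak (assumed meaning): both appear in the paragraph, but in different sentences.
    if any(contains(paragraph, s) for s in secondary):
        return "weak"
    # Primary keyword with no supporting secondary keyword is discarded.
    return None
```

For example, with hypothetical keyword lists `primary = ["drug delivery"]` and `secondary = ["licensing", "partnership"]`, a paragraph that mentions a licensing deal for drug delivery in one sentence would be classified as a strong hit, while a paragraph that mentions drug delivery in one sentence and a partnership in the next would be a weak hit.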
The data set will now be maintained by the OTT and used by out-licensing professionals to answer any number of future queries. And the OTT never has to worry about the data set becoming “dated” because each record is automatically updated every time there is a change to one of the corresponding web pages. (For more information about this application, visit IBM.com and see http://www-01.ibm.com/software/ebusiness/jstart/portfolio/ncsuCaseStudy.pdf .)
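How the system detects a changed web page is not described here. One common approach, shown below purely as an assumption, is to store a hash of each page's content with its record and refresh the record only when a newly fetched page hashes differently. The record layout and function names are hypothetical.

```python
import hashlib

def fingerprint(page_text):
    # SHA-256 digest of the page text, used as a cheap change detector.
    return hashlib.sha256(page_text.encode("utf-8")).hexdigest()

def refresh_record(record, fetched_text):
    """Update the record in place if the page changed; return True if it was updated."""
    new_hash = fingerprint(fetched_text)
    if record.get("hash") == new_hash:
        return False  # Page unchanged, record is still current.
    record["text"] = fetched_text
    record["hash"] = new_hash
    return True
```

A crawler could call `refresh_record` on each revisit of a page, so that only records whose source pages actually changed are rewritten.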
A Suite of Applications
The successful application of these powerful algorithms for NC State’s out-licensing professionals is just the start. We envision an entire suite of applications — delivered to organizations as a software service — from NC State’s own cloud computing environment, the Virtual Computing Lab.
This will bring the power of a supercomputer to the desktops of heads of R&D, business development and market analysts — when they need it. Moreover, it will allow the analysts to quickly search, interpret, synthesize, and graphically display the intelligence (patterns) gleaned from massive amounts of data.
Of course, at NC State our primary goal is to get these applications into the hands of faculty and students. They would be able to test the commercial viability of their ideas quickly and efficiently early in the technology development process — potentially saving years. They could also use these tools for locating potential companies and investors interested in funding their research programs.
When we are together at the Spring Sponsors Meeting (May 25-27 in Raleigh), I will brief you on the full suite of applications planned and other proof of concept projects already underway.
Thanks to CIMS Fellows
I would be remiss if I did not say that none of this would have been possible without the deep domain experience of two CIMS Industrial Fellows: Michael (Mike) Kowolenko and Richard (Dick) Kouri. As mentioned, we chose technologies and real-life challenges from the pharmaceutical industry for the proof of concept.
In addition to being an SVP, Biotech Technical Operations and Product Supply at Wyeth, Mike was also SVP, Pharmaceutical Operations and Technology at Biogen Idec. And Dick, as many of you know, was a serial entrepreneur in the bioscience industry with over 11 successful launches and four exits totaling $878 million to his credit.
While the solution they have created is generic to all industries, I doubt that CIMS could have accomplished its initial exploration into the world of unstructured data without their talents.
If there are any questions I can answer about this, or any other CIMS activity, please don’t hesitate to email me at Paul_Mugge@ncsu.edu.