3,303
Views
37
CrossRef citations to date
0
Altmetric
Special Symposium Articles

Uncovering text mining: A survey of current work on web-based epidemic intelligence

Pages 731-749 | Received 20 Oct 2011, Accepted 06 Mar 2012, Published online: 11 Jul 2012

References

  • Berry , M.W. and Kogan , M. 2010 . Text mining: applications and theory , Edison , NJ : Wiley .
  • Brownstein , J. , Freifeld , C. , Reis , B. and Mandl , K. 2008 . Surveillance san frontières: Internet-based emerging infectious disease intelligence and the HealthMap project . Public Library of Science Medicine , 5 ( 7 ) : 1019 – 1024 .
  • Buckeridge , D. , Burkom , H. , Campbell , M. , Hogan , W.R. and Moore , A.W. 2005 . Algorithms for rapid outbreak detection: a research synthesis . Journal of Biomedical Informatics , 38 ( 2 ) : 99 – 113 .
  • Chanlekha , H. , Kawazoe , A. , and Collier , N. , 2010 . A framework for enhancing spatial and temporal granularity in report-based health surveillance systems . BMC Medical Informatics and Decision Making , 10 1 , e43 .
  • Chaudet , H. 2006 . Extending the event calculus for tracking epidemic spread . Artificial Intelligence in Medicine , 38 ( 2 ) : 137 – 156 .
  • Collier , N. 2010 . What's unusual in online disease outbreak news? . Journal of Biomedical Semantics , 1 ( 1 ) : 2
  • Collier , N. 2011 . Towards cross-lingual alerting for bursty epidemic events . Journal of Biomedical Semantics , 2 ( Suppl. 5 ) : S10
  • Collier , N. and Doan , S. , 2011 . Syndromic classification of Twitter messages . Proceedings of eHealth , 21–23 November , Malaga , Spain , arXiv:1110.3094 .
  • Collier , N. and Doan , S. 2012 . GENI-DB: A database of global events for epidemic intelligence . Bioinformatics , 28 ( 8 ) : 1186 – 1188 .
  • Collier , N. , Doan , S. , Kawazoe , A. , Matsuda Goodwin , R. , Conway , M. , Tateno , Y. , Ngo , Q. , Dien , D. , Kawtrakul , A. , Takeuchi , K. , Shigematsu , M. and Taniguchi , K. 2008 . BioCaster: detecting public health rumors with a Web-based text mining system . Bioinformatics , 24 ( 24 ) : 2940 – 2941 .
  • Collier , N. , Kawazoe , A. , Jin , L. , Shigematsu , M. , Dien , D. , Barrero , R. , Takeuchi , K. and Kawtrakul , A. 2006 . A multilingual ontology for infectious disease surveillance: rationale, design and challenges . Language Resources and Evaluation , 40 ( 3–4 ) : 405 – 413 .
  • Collier , N. , Kawazoe , A. , Shigematsu , M. , Taniguchi , K. , Jin , L. , McCrae , J. , Dien , D. , Hung , Q. , Takeuchi , K. , and Kawtrakul , A. , 2007 . Ontology-driven influenza surveillance from Web rumours . Proceedings on Options for the Control of Influenza VI (Options 2007) , 17–23 June , Toronto , Ontario , , Canada .
  • Collier , N. , Goodwin , R.M. , McCrae , J. , Doan , S. , and Kawazoe , A. , 2010 . An ontology-driven system for detecting global health events . Proceedings of the 23rd International Conference on Computational Linguistics (COLING) 23–27 August , Beijing , , China 215 – 222 .
  • Conway , M. , Doan , S. , Kawazoe , A. and Collier , N. 2009 . Classifying disease outbreak reports using n-grams and semantic features . International Journal of Medical Informatics , 78 ( 12 ) : e47 – e58 .
  • Corley , C.D. , Cook , D.J. , Mikler , A.R. and Singh , K.P . 2010 . Text and structure data mining of influenza mentions in Web and social media . International Journal of Environmental Research and Public Health , 7 : 596 – 615 .
  • Culotta , A. , 2010 . Detecting influenza outbreaks by analyzing Twitter messages . Southeastern Louisiana University Technical Report . Available from: arXiv:1007.4748v1 [cs.IR] [Accessed 25 July 2012] .
  • Damianos , L. , Ponte , J. , Wohlever , S. , Reeder , F. , Day , D. , Wilson , G. and Hirschman , L. 2002 . MiTAP for bio-security: a case study . AI Magazine , 23 ( 4 ) : 13 – 29 .
  • Eysenbach , G. 2002 . Infodemiology: the epidemiology of (mis)information . American Journal of Medicine , 113 ( 9 ) : 763 – 765 .
  • Fayyad , U. , Piatetsky-Shapiro , G. and Smyth , P. 1996 . From data mining to knowledge discovery in databases . AI Magazine , 17 ( 3 ) : 37 – 54 .
  • Feldman , R. and Sanger , J. 2006 . The text mining handbook: advanced approaches in analyzing unstructured data , Cambridge : Cambridge University Press .
  • Fuller , S. 2010 . Tracking the global express: new tools addressing disease threats across the world . Epidemiology , 21 ( 6 ) : 769 – 771 .
  • Ginsberg , J. , Mohebbi , M. , Patel , R. , Brammer , L. , Smolinski , M. and Brilliant , L. 2008 . Detecting influenza epidemics using search engine query data . Nature , 457 : 1012 – 1014 .
  • Grishman , R. , Huttunen , S. and Yangarber , R. 2002 . Information extraction for enhanced access to disease outbreak reports . Journal of Biomedical Informatics , 35 ( 4 ) : 236 – 246 .
  • Hartley , D. , Nelson , N. , Walters , R. , Arthury , R. , Yangarber , R. , Madoff , L. , Linge , Y. , Mawudeku , A. , Collier , N. , Brownstein , J. , Thinus , G. and Lightfoot , N. 2010 . The landscape of international event-based biosurveillance . Emerging Health Threats Journal , 3 : e3
  • Hearst , M. , 1999 . Untangling text data mining . Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics , 20–26 June 1999 , Maryland , , USA , 3 – 10 .
  • Hirschman , L. , Park , J.C. , Tsujii , J. , Wong , L. and Wu , C.H. 2002 . Accomplishments and challenges in literature data mining for biology . Bioinformatics , 18 ( 12 ) : 1553 – 1561 .
  • Humphreys , B. and Lindberg , D. 1993 . The UMLS project: making the conceptual connection between users and the information they need . Bulletin of the Medical Library Association , 81 ( 2 ) : 170
  • Hutwagner , L. , Thompson , W. , Seeman , M.G. and Treadwell , T. 2003 . The bioterrorism preparedness and response early aberration and reporting system (EARS) . Journal of Urban Health , 80 ( 2 ) : i89 – i96 .
  • Janson , B. and Spink , A. 2006 . How are we searching the World Wide Web? A comparison of nine search engine transaction logs . Information Processing and Management , 42 ( 1 ) : 248 – 263 .
  • Jones , E. , Patel , N. , Levy , M. , Storeygard , A. , Balk , D. , Gittleman , J. and Daszak , P. 2008 . Global trends in emerging infectious diseases . Nature , 451 : 990 – 993 .
  • Keller , M. , Freifeld , C.C. and Brownstein , J.S. 2009 . Automated vocabulary discovery for geo-parsing online epidemic intelligence . Bio Medical Central Bioinformatics , 10 : 385
  • Kosala , R. and Blockeel , H. 2000 . Web mining research: a survey . Association for Computing Machinery Special Interest Group on Knowledge Discovery and Data Mining Explorations , 2 ( 1 ) : 1 – 15 .
  • Lampos , V. and Cristianini , N. , 2010 . Tracking the flu pandemic by monitoring the social web . 2nd IAPR Workshop on Cognitive Information Processing (CIP 2010) , 14–16 June 2010 , Tuscany Italy , 411 – 416 .
  • Lin , S. and Ho , J. , 2002 . Discovering informative content blocks from Web documents . Proceedings of ACM International Conference on Knowledge Discovery and Data Mining (SIGKDD) , 23–26 July 2002 , Alberta Canada .
  • Lowe , H. and Barnett , G. 1994 . Understanding and using the medical subject headings (MeSH) vocabulary to perform literature searches . Journal of the American Medical Association , 271 : 1103 – 1108 .
  • Lyon , A. , Nunn , M. , Grossel , G. , and Burgman , M. , 2011 . Comparison of Web-based biosecurity intelligence systems: BioCaster, EpiSPIDER and HealthMap . Transboundary and Emerging Diseases [E-publication ahead of print]. Available from: http://onlinelibrary.wiley.com/doi/10.1111/j.1865-1682.2011.01258.x/abstract [Accessed 25 July 2012] .
  • Madoff , L.C. and Woodall , J.P. 2005 . The Internet and the global monitoring of emerging diseases: lessons from the first 10 years of ProMED . Archives of Medical Research , 36 : 724 – 730 .
  • Mawudeku , A. and Blench , M. , 2006 . Global Public Health Intelligence Network (GPHIN) . Proceedings of the 7th Conference of the Association for Machine Translation in the Americas , 8–12 August , Cambridge , MA .
  • McCallum , A. and Li , W. , 2003 . Early results for named entity recognition with conditional random fields, feature induction and web-enhanced lexicons . Proceedings of the Seventh Conference on Natural Language Learning , 31 May–1 June 2003 , Edmonton , , Canada , 188 – 191 .
  • Nadeau , D. and Sekine , S. 2007 . A survey of named entity recognition and classification . Linguisticae Investigationes , 30 ( 1 ) : 3 – 26 .
  • Paquet , C. , Coulombier , D. , Kaiser , R. , and Ciotti , M. , 2006 . Epidemic intelligence: a new framework for strengthening disease intelligence in Europe . EuroSurveillance , 11 12 pii = 665 .
  • Polgreen , P.M. , Chen , Y. , Pennock , D.M. and Nelson , F.D. 2008 . Using Internet searches for influenza surveillance . Clinical Infectious Diseases , 47 ( 11 ) : 1443 – 1448 .
  • Price , C. and Spackman , K. 2000 . SNOMED clinical terms . British Journal of Healthcare Computing & Information Management , 17 ( 3 ) : 27 – 31 .
  • Rosse , C. and Mejino , J.L.V. , 2008 . The foundational model of anatomy ontology . In: A. Burger , D. Davidson and R. Baldock Anatomy ontologies for bioinformatics: principles and practice . London : Springer , vol. 6 , 59 – 117 .
  • Signorini , A. , Segre , A.M. and Polgreen , P.M. 2011 . The use of Twitter to track levels of disease activity and public concern in the U.S. during the influenza A H1N1 pandemic . Public Library of Science One , 6 ( 5 ) : 19467
  • Soergel , D. , Lauser , B. , Liang , A. , Fisseha , F. , Keizer , J. , and Katz , S. , 2004 . Reengineering thesauri for new applications: the AGROVOC example . Journal of Digital Information , 4 4 . Available from: http://jodi.ecs.soton.ac.uk/Articles/v04/i04/Soergel
  • Steinberger , R. , Flavio , F. , van der Goot , E. , Best , C. , von Etter , P. and Yangarber , R. 2008 . “ Text mining from the web for medical intelligence ” . In Mining massive data sets for security , Edited by: Fogelman-Soulié , F. , Perrotta , D. , Piskorski , J. and Steinberger , R. 295 – 310 . Amsterdam , , The Netherlands : IOS Press .
  • Swanson , D.R. 1986 . Fish oil, Raynaud's syndrome, and undiscovered public knowledge . Perspectives in Biology and Medicine , 30 ( 1 ) : 7 – 18 .
  • The Open Biomedical Ontologies (OBO) , 2011 . The open biomedical ontologies [online] . Available from: http://www.obofoundry.org/ [Accessed 25 September 2011] .
  • Tolentino , H. , Kamadjeu , R. , Fontelo , P. , Liu , F. , Matters , M. , Pollack , M. and Madoff , L. 2007 . Scanning the emerging infectious disease horizon – visualizing ProMED emails using EpiSpider . Advances in Disease Surveillance , 2 : 169
  • Torii , M. , Yin , L. , Nguyen , T. , Mazumdar , C.T. , Liu , H. , Hartlet , D.M. and Nelson , N.P. 2011 . An exploratory study of a text classification framework for Internet-based surveillance of emerging epidemics . International Journal of Medical Informatics , 80 ( 1 ) : 56 – 66 .
  • Vaillant , L. , Nys , J. , Gastellu-Etchegorry , M. , and Barboza , P. , 2011a . Enhancement of sensitivity with gathering Internet-based systems for early threat detection within the global health security initiative (GHSI): the EAR project . Proceedings of eHealth , 21–23 November , Malaga , Spain , (in press). Available from: http://electronic-health.org/poster_abstracts/ehealth2011_poster_GHSAG.pdf [Accessed 3 July 2012] .
  • Vaillant , L. , Barboza , P. , and Arthur , R.R. , 2011b . Epidemic intelligence: assessing event-based tools and user's perception in the GHSAG community . Proceedings of IMED 2011 , 4–7 February Vienna , , Austria .
  • von Etter , P. , Huttunen , S. , Vihavainen , A. Vourinen , M. , and Yangarber , R. , 2010 . Assessment of utility in Web mining for the domain of public health . Proceedings of NAACL HLT 2010 Workshop on Text and Data Mining of Health Documents , 5 June 2010 , California , USA , 29 – 37 .
  • Wagner , M.M. , Tsui , F.C. , Espino , J.U. , Dato , V.M. , Sittig , D.F. , Caruana , R.A. , McGinnis , L.F. , Deerfield , D.W. , Druzdzel , M.J. and Fridsma , D.B. 2001 . The emerging science of very early detection of disease outbreak . Journal of Public Health Management Practices , 7 ( 6 ) : 51 – 59 .
  • Wikipedia , 2009 . 2009 flu pandemic timeline [online] . Available from: http://en.wikipedia.org/wiki/2009_flu_pandemic_timeline [Accessed 25 September 2011] .
  • Wilks , Y. 2009 . Machine translation – its scope and limits , London : Springer .
  • Zamite , J. , Silva , F.A.B. , Couto , F. and Silva , M.J. 2010 . MEDCollector: multisource epidemic data collector . Lecture Notes in Computer Science , 6266 : 16 – 30 .