Frank van Harmelen
Professor, Knowledge Representation and Reasoning Group,
Faculty of Science, Vrije Universiteit Amsterdam
Frank van Harmelen (1960) is a professor in Knowledge Representation & Reasoning in the Computer Science department (Faculty of Science) at the Vrije Universiteit Amsterdam. After studying mathematics and computer science in Amsterdam, he moved to the Department of AI in Edinburgh, where he was awarded a PhD in 1989 for his research on meta-level reasoning. While in Edinburgh, he co-developed a logic-based toolkit for expert systems, and worked with Prof. Alan Bundy on proof planning for inductive theorem proving. After his PhD research, he moved back to Amsterdam where he worked from 1990 to 1995 in the SWI Department under Prof. Wielinga, on the use of reflection in expert systems, and on the formal underpinnings of the CommonKADS methodology for Knowledge-Based Systems. In 1995 he joined the AI research group at the Vrije Universiteit Amsterdam, where he now leads the Knowledge Representation and Reasoning Group.
Since 2000, he has played a leading role in the development of the Semantic Web, which aims to make data on the web semantically interpretable by machines through formal representations. He was co-PI on the first European Semantic Web project (OnToKnowledge, 1999), which laid the foundations for the Web Ontology Language OWL. OWL has become a worldwide standard; it is in wide commercial use and has become the basis for an entire research community. He co-authored the Semantic Web Primer, the first academic textbook of the field, now in its third edition and in worldwide use (translations in 5 languages, 10,000 copies sold of the English edition alone). He was one of the architects of Sesame, an RDF storage and retrieval engine in wide academic and industrial use, with over 200,000 downloads. This work received the 10-year impact award at the 11th International Semantic Web Conference in 2012, the most prestigious award in the field.
In recent years, he has pioneered the development of large-scale reasoning engines. He was scientific director of the 10M-euro EU-funded Large Knowledge Collider, a platform for distributed computation over semantic graphs with billions of edges. The prize-winning work with his student Jacopo Urbani improved the state of the art by two orders of magnitude.
He is scientific director of The Network Institute, an interdisciplinary research institute in which some 150 researchers from the Faculties of Social Science, Humanities, and Computer Science collaborate on research topics in computational social science and e-humanities.
He is a fellow of the European AI Society ECCAI (membership limited to 3% of all European AI researchers). In 2014 he was admitted as a member of the Academia Europaea (limited to the top 5% of researchers in each field), and in 2015 as a member of the Royal Netherlands Society of Sciences and Humanities (450 members across all sciences). He is a guest professor at the University of Science and Technology in Wuhan, China.
Title: The Web of Data: Do we actually understand what we built?
Abstract: The good news: a distributed knowledge-base that describes hundreds of millions of items through tens of billions of relations between them, classifying them into hundreds of thousands of different classes, hosted on a web of thousands of different servers across the world, with fully distributed access and open to contributions from anybody. A knowledge-base on this scale, of this size and of such broad coverage would have been unthinkable 15 years ago, but it has now become reality under a variety of names such as the Semantic Web, the Linked Open Data cloud, or the Web of Data.
The bad news: despite this success, we actually understand very little of the structure of the Web of Data. Its formal meaning is specified in logic, but with its scale, context dependency and dynamics, the Web of Data has outgrown its traditional model-theoretic semantics. Is the meaning of a logical statement (an edge in the graph) dependent on the cluster (“context”) in which it appears? Does a more densely connected concept (node) contain more information? Is the path length between two nodes related to their semantic distance? Properties such as clustering, connectivity and path length are not described, much less explained by model-theoretic semantics. Do such properties contribute to the meaning of a knowledge graph?
To properly understand the structure and meaning of knowledge graphs, we should no longer treat knowledge graphs as (only) a set of logical statements, but treat them properly as a graph. But how to do this is far from clear. In this talk, we'll report on some of our early results on these questions, but we'll ask many more questions for which we don't have answers yet.
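The graph-level properties the abstract asks about can at least be made concrete. Below is a minimal sketch, using a made-up toy graph rather than real Linked Data, of how treating a set of triples as an undirected graph lets one measure path length between nodes and local clustering; all entity and relation names are illustrative assumptions.

```python
from collections import defaultdict, deque

# Toy knowledge graph as (subject, predicate, object) triples.
# These entities and relations are made up for illustration.
triples = [
    ("Amsterdam", "capitalOf", "Netherlands"),
    ("Netherlands", "memberOf", "EU"),
    ("VU", "locatedIn", "Amsterdam"),
    ("UvA", "locatedIn", "Amsterdam"),
    ("VU", "partnerOf", "UvA"),
]

# View the triples as an undirected graph over the entities.
adj = defaultdict(set)
for s, _, o in triples:
    adj[s].add(o)
    adj[o].add(s)

def path_length(a, b):
    """Shortest number of edges between two nodes (BFS), or None."""
    seen, frontier = {a}, deque([(a, 0)])
    while frontier:
        node, d = frontier.popleft()
        if node == b:
            return d
        for n in adj[node]:
            if n not in seen:
                seen.add(n)
                frontier.append((n, d + 1))
    return None

def clustering(node):
    """Local clustering coefficient: fraction of neighbour pairs linked."""
    nbrs = list(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k)
                if nbrs[j] in adj[nbrs[i]])
    return 2.0 * links / (k * (k - 1))

print(path_length("VU", "EU"))   # 3 hops: VU -> Amsterdam -> Netherlands -> EU
print(clustering("Amsterdam"))   # 1/3: only VU and UvA, of its 3 neighbours, are linked
```

Whether such structural quantities relate to the *meaning* of the graph (e.g. whether hop distance tracks semantic distance) is precisely the open question the talk raises.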
Principal Researcher, Microsoft Research Asia
Dr. Ming Zhou is a principal researcher and the manager of the Natural Language Computing Group at Microsoft Research Asia. He is an expert in machine translation and natural language processing. He came to Microsoft in 1999 from his post as an associate professor of computer science at Tsinghua University. He designed the well-known Chinese-Japanese machine translation software product J-Beijing in Japan, which was granted the Makoto Nagao Award, the highest prize for machine translation products, by the Japan Machine Translation Association in 2008. He also designed the CEMT-I machine translation system in 1989, the first Chinese-English machine translation experiment in mainland China. He is the key inventor and technology leader of the well-known Chinese Couplets Generation AI game and of the English-assistance search engine Engkoo, which won the Wall Street Journal's 2010 Asian Innovation Readers' Choice Award and shipped in Bing in 2011, renamed Bing Dictionary (http://cn.bing.com/dict/), as well as of the Engkoo cloud IME, which shipped as Bing IME in 2012.
Dr. Zhou has served many times as an area chair of ACL, IJCAI, AAAI, EMNLP, COLING, SIGIR, and IJCNLP, and as PC chair of AIRS 2004, PC chair of NLPCC 2012, and general chair of NLPCC 2013. NLPCC is the top NLP conference in China. He has authored or co-authored about 100 papers published at top NLP conferences such as ACL. He was co-director of the MS-HIT Joint Lab on NLP and Speech from 2000 to 2008, and has been co-director of the MS-Tsinghua Joint Lab on Media and Network since 2008. He is also a technical board member of the Institute of Automation, Chinese Academy of Sciences. He collaborates broadly with universities on projects in NLP, machine translation, text mining, and social networks, including the well-known project on sign language recognition and translation with Prof. Xilin Chen's team at the Institute of Computing Technology, Chinese Academy of Sciences, and Prof. Hanjing Li's team at Beijing Union University.
Dr. Zhou received three Science & Technology Promotion awards from the Ministry of China Aerospace for his research in machine translation, and holds a Chinese software patent for his Chinese spelling-checking system. Since 1986, he has led several projects in machine translation, Chinese spelling checking, and Chinese syntactic parsing funded by the China Natural Science Foundation and the Ministry of China Aerospace. He has conducted several cooperative projects in English-Chinese machine-aided translation, Japanese-Chinese machine translation, Chinese spelling checking, Chinese text information retrieval, and Korean-Chinese machine translation with universities and companies in America, Japan, Hong Kong, and Korea. He was in charge of the development of three commercial software products in China and Japan: the DEAR translator's workstation, the WinChar Chinese spelling-checking system, and the J-Beijing Chinese-Japanese machine translation system. He has served as a program committee member for several international conferences on natural language processing.
Dr. Zhou received his B.S. degree in computer engineering from Chongqing University in 1985, and his M.S. degree and Ph.D. in computer science from Harbin Institute of Technology in 1988 and 1991, respectively. He did post-doctoral work at Tsinghua University from 1991 to 1993, when he became an associate professor there. He visited the Chinese University of Hong Kong as a research associate in 1985 and the City University of Hong Kong as a research fellow in 1986. Between November 1996 and March 1999, he worked for Kodensha Co., Ltd. in Japan as the project leader of the Chinese-Japanese machine translation project that produced the J-Beijing commercial software in 1998. Between April and August 1999, he was the leader of the NLP research group of the Department of Computer Science, Tsinghua University. He joined Microsoft Research China in September 1999.
Title: The Progress of Knowledge-Based Question-Answering
Abstract: Knowledge-Based Question Answering (KB-QA) is a significant human-computer interaction technology for search engines, databases, speech assistants, and chat-bots, and it has broad applications in the mobile internet, e-commerce, customer support, and IoT. In recent years, driven by the strong needs of search engines and the mobile internet, and supported by the availability of large-scale knowledge bases, KB-QA has made considerable progress and various novel methodologies have emerged. However, despite plenty of exciting progress in KB-QA, many challenging problems remain unsolved. The purpose of this talk is to carefully examine the recent progress of KB-QA, analyze its challenges, and accordingly propose new approaches to boost the field to the next level. I will start with an introduction to KB-QA, including its definition, the knowledge bases readily available for KB-QA, and a brief summary of its technical development over the last 30 years. Then, I will elaborate on the two major technological approaches to KB-QA: semantic-parser-motivated QA (SP-QA) and IR-motivated QA (IR-QA). The effective application of deep learning technologies demonstrated in recent KB-QA work will also be described. To understand the advantages and limitations of the existing approaches, I will present a thorough comparison of various systems based on their performance on a public evaluation dataset, and propose suggestions for addressing the observed problems in the future. In the final part of the talk, I will introduce the KB-QA work conducted at Microsoft Research Asia, including the LIGHT QA system and the TRINITY graph engine, which is used for efficient storage of and access to large-scale knowledge bases.
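To make the IR-motivated flavor of KB-QA concrete, here is a minimal sketch, not any system described in the talk: candidate facts in a tiny hand-made knowledge base are ranked by lexical overlap between the question and each fact's topic entity and relation, and the best-matching fact's object is returned as the answer. The KB contents and the scoring are illustrative assumptions; real IR-QA systems use learned matching models over far larger KBs.

```python
# Toy KB of (topic entity, relation, answer) facts -- illustrative data only.
kb = [
    ("China", "capital", "Beijing"),
    ("Japan", "capital", "Tokyo"),
    ("China", "longest river", "Yangtze"),
]

def answer(question):
    """Return the object of the fact whose topic+relation best overlaps
    the question's words, or None if nothing matches at all."""
    words = set(question.lower().replace("?", " ").split())

    def score(fact):
        topic, relation, _ = fact
        return len(words & set((topic + " " + relation).lower().split()))

    best = max(kb, key=score)
    return best[2] if score(best) > 0 else None

print(answer("What is the capital of China?"))       # Beijing
print(answer("What is the longest river in China?")) # Yangtze
```

SP-QA, by contrast, would first translate the question into a formal query (e.g. a logical form or SPARQL) and execute it against the KB, rather than ranking facts by surface similarity.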
Professor, Biomedical Knowledge Engineering Laboratory, Seoul National University, Korea
Hong-gee Kim is the director of the Biomedical Knowledge Engineering Laboratory (BiKE), and a professor and dentistry library dean at the School of Dentistry, Seoul National University (SNU). Prior to SNU, he was an associate professor of Management Information Systems at Dankook University. At Seoul National University he is also an adjunct professor in Computer Science, the Collaborative Medical Informatics Program, the Collaborative Cognitive Science Program, the Collaborative Archiving Studies Program, and the Graduate School of Convergence Science and Technology. At Seoul Women's University, he is a lecturer in the Department of Contemporary Art. Overseas, he is an adjunct professor in Informatics at the National University of Ireland, and has been a visiting professor at Harvard Medical School.
He earned degrees in Philosophy, Psychology, and Computer Science. He has authored more than 200 papers in journals and conference proceedings that cover diverse topics in computer science, medicine and dentistry, biology, cognitive science, law, and industrial engineering. His current research interests include Semantic Web in bio-medical informatics, Ontology Engineering, and Semantic Knowledge Space.
Recently, he has been actively involved in the Exo-brain project, which develops a cluster-based triple store that houses large-scale RDF data with SPARQL query support, and creates a scalable, hybrid ontology mapping tool that handles large-scale ontologies in Linked Data and combines instance-based and lexical-based mapping approaches. He also leads the development of a genome analysis research platform with multiple, linked -omics data. The project develops search capabilities over consolidated -omics data equipped with rich semantics, and provides analysis functions over the integrated, linked -omics data. He pioneered FARM (FCA-based Association Rule Miner), which supports not only traditional FCA (Formal Concept Analysis) features such as formal contexts, formal concepts, and concept lattices, but also association rule extraction for a given user's interest expressed as a Boolean expression. By taking advantage of the structural characteristics of a concept lattice, it efficiently extracts and visualizes non-redundant association rules corresponding to the given user's interest.
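For readers unfamiliar with FCA, the core notion is easy to show on a toy example. A formal concept is a pair (extent, intent) in which the set of objects and the set of attributes determine each other exactly. The sketch below brute-forces all formal concepts of a tiny hand-made context; the data and the naive enumeration are illustrative assumptions, not how FARM itself operates (FARM works over the concept lattice, far more efficiently).

```python
from itertools import combinations

# Toy formal context: objects mapped to their attributes (illustrative data).
context = {
    "milk":   {"liquid", "dairy"},
    "cheese": {"solid", "dairy"},
    "water":  {"liquid"},
}
attributes = {"liquid", "solid", "dairy"}

def extent(attrs):
    """All objects that have every attribute in attrs."""
    return {o for o, a in context.items() if attrs <= a}

def intent(objs):
    """All attributes shared by every object in objs."""
    common = set(attributes)
    for o in objs:
        common &= context[o]
    return common

# A formal concept is a pair closed under the two derivation maps:
# intent(extent(B)) == B. Brute-force over all attribute subsets.
concepts = set()
for r in range(len(attributes) + 1):
    for attrs in combinations(sorted(attributes), r):
        objs = extent(set(attrs))
        concepts.add((frozenset(objs), frozenset(intent(objs))))

for objs, atts in sorted(concepts, key=lambda c: (len(c[0]), sorted(c[1]))):
    print(sorted(objs), sorted(atts))
```

On this context the enumeration yields six concepts, e.g. ({milk, water}, {liquid}) and ({milk, cheese}, {dairy}); ordered by extent inclusion, they form the concept lattice from which association rules are read off.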
Title: Philosophy of Semantics-based Data Intensive Science
Abstract: The face of science assumes different masks at different times, from experimental to theoretical to computational science. Experiments are devised to test hypotheses, which require a priori descriptive, exploratory, inductive theory by means of models and generalization, which in many cases is too complicated to solve analytically. Recently, computers armed with massive processing power have been used to simulate complex scientific phenomena, generating unprecedented numbers and kinds of data. Scientists now face large and complex data housed at distributed data centers, constantly and quickly fed from various sources such as sensors, simulators, experiments, the literature, and the Web, which demands, at the very least, ways of representing semantics, and rich semantics at that.
The Semantic Web offers Web-enabled technologies that allow the linking of various data sets, with semantics afforded mainly by RDF and OWL. These standards provide a means to define semantics, which can then be interpreted and reasoned over for further operations such as the integration and reuse of knowledge. The Semantic Web has fired up numerous studies, notably in biology and medicine, proving its usefulness, as witnessed by the wide adoption of the Gene Ontology. Previously isolated data islands are now gradually being interlinked using Semantic Web technologies. With such an explosion of data produced and interlinked with well-defined semantics, we are witnessing the inception of a new paradigmatic shift toward what can be termed "semantics-based data-intensive science." What lies ahead in the new science is not yet clearly seen; if necessary, we may even need to revisit and reinforce our current approaches in order to achieve enhanced data semantics, learned from, for example, the cognitive sciences and psychology, streamlined research workflows, and bidirectional micro-to-macro knowledge integration, such as gene to protein to cell, or even to an organism, and vice versa. We can only maintain openness to innovation, and it may well be that the new science is not simply a science but an art that embraces a great variability of ideas and approaches, as variable as we humans are.
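The mechanics of such interlinking are simple to illustrate. In the minimal sketch below, two independently published toy datasets become jointly queryable the moment they share an identifier for the same gene; the example.org URIs, predicate names, and the tiny path-following query are illustrative assumptions (a real deployment would use RDF stores and SPARQL rather than Python lists).

```python
# Two independently published toy datasets, as (subject, predicate, object)
# triples. All URIs and predicates here are made up for illustration.
gene_data = [
    ("http://example.org/gene/BRCA1", "associatedWith", "breast cancer"),
]
protein_data = [
    ("http://example.org/protein/P38398", "encodedBy",
     "http://example.org/gene/BRCA1"),
]

# Because both datasets use the same URI for the gene, merging them
# produces one graph that supports queries spanning both sources.
graph = gene_data + protein_data

def follow(start, *predicates):
    """Walk a chain of predicates from a starting node, returning the
    set of nodes reached at the end of the chain."""
    nodes = {start}
    for pred in predicates:
        nodes = {o for s, p, o in graph if s in nodes and p == pred}
    return nodes

# Which diseases is this protein linked to, via the gene that encodes it?
print(follow("http://example.org/protein/P38398",
             "encodedBy", "associatedWith"))  # {'breast cancer'}
```

Neither dataset alone can answer the protein-to-disease question; the answer only exists in the merged graph, which is the essential point of Linked Data.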
Principal Architect, Baidu
Jizhou Huang is a principal architect and the tech lead of the Recommendation Team of Baidu Web Search. He is an expert in recommendation, image search, and query understanding. He joined Baidu in January 2010 and has led or participated in the development of a number of products. He was in charge of the development of the Query Understanding module for Box Computing (http://boxcomputing.baidu.com/). He has led several projects in Baidu Image Search (http://image.baidu.com/), including search algorithms, search suggestions, related search, and the recommendation system. He is the technology leader of the Recommendation Team of Baidu Web Search (https://www.baidu.com/), which won the 2014 Baidu Million Dollar Prize. In addition, he has filed more than 100 patent applications in several countries, and was given the "Baidu Special Invention Award" for continued excellence in technical innovation.
Title: Recommendation for Web Search: From Big Data to Big Impact
Abstract: Recommendation, providing search suggestions related to the query, has become a key feature of today's web search engines. In order to better satisfy users' information needs, search engines are increasingly aiming to extend the search results by providing richer information beyond just showing the usual links. Over the past few years, commercial web search engines (such as Baidu, Google, etc.) have enriched user experiences by recommending related topics (such as books, music, novels, movies, and games) that may be of interest to users, to unleash their potential demands. This talk presents the current state of research in the recommendation team of Baidu web search (https://www.baidu.com/), including several innovations as well as industry practices using big data to improve user engagement and satisfaction.
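One classic signal behind such related-search suggestions is session co-occurrence: queries that users tend to issue in the same search session are candidates for recommending alongside each other. The sketch below illustrates only this generic idea on made-up session logs; it is not Baidu's actual method, and production systems combine many more signals with learned ranking models.

```python
from collections import Counter, defaultdict

# Made-up session logs: each inner list is the queries of one user session.
sessions = [
    ["harry potter", "harry potter movie", "harry potter book"],
    ["harry potter", "harry potter movie"],
    ["lord of the rings", "harry potter"],
]

# Count, for each query, how often every other query shares a session.
cooccur = defaultdict(Counter)
for session in sessions:
    for q in session:
        for other in session:
            if other != q:
                cooccur[q][other] += 1

def suggest(query, k=2):
    """Top-k queries most often issued in the same session as `query`."""
    return [q for q, _ in cooccur[query].most_common(k)]

print(suggest("harry potter"))  # most frequent co-query first
```

Here "harry potter movie" co-occurs with "harry potter" in two sessions and so ranks first among the suggestions.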