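The 24th International Conference on Pattern Recognition (ICPR 2018) will be held at the China National Convention Center in Beijing from August 20 to 24, 2018. This is the first time the conference has been held in mainland China in the more than 40 years since it was founded. The conference is hosted by the International Association for Pattern Recognition (IAPR), 万博学会, and the Institute of Automation, Chinese Academy of Sciences. Leading experts from China and abroad in pattern recognition, machine learning, computer vision, and related fields will gather to exchange the latest results and development trends in their research areas.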
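Conference tracks, invited talks, and mystery guests: The conference is organized into six tracks (Pattern Recognition and Machine Learning; Computer Vision; Speech, Image, Video and Multimedia; Biometrics and Human-Computer Interaction; Document Analysis and Recognition; and Biomedical Imaging and Bioinformatics). In addition to oral presentations and poster sessions, the conference is honored to have six keynote speakers (Zhi-Hua Zhou, Long Quan, Jianchang Mao, K. Venkatesh Prasad, Ashok Popat, Alison Noble). Three recipients of IAPR honorary awards will also attend (King Sun Fu Prize: Matti Pietikäinen; J.K. Aggarwal Prize: Kristen Grauman; Maria Petrou Prize: Rita Cucchiara).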
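King Sun Fu Prize winner: Matti Pietikäinen, Professor, University of Oulu, Finland
J.K. Aggarwal Prize winner: Kristen Grauman, Professor, University of Texas at Austin, USA
Maria Petrou Prize winner: Rita Cucchiara, Professor, University of Modena, Italy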
Long Quan, Hong Kong University of Science and Technology, China
Title: The Challenges of 3D Reconstruction with Deep Learning
In this talk, I will review past developments in computer vision and visual learning. Then I will turn the focus to recent exciting work in deep visual learning and the 3D reconstruction breakthrough in computer vision. I will showcase reconstruction approaches at large scale, covering hundreds of square kilometers of high-rise metropolitan areas and undeveloped rural areas captured from drones, and at small scale, covering everyday objects captured with smartphones. I will also demonstrate the online cloud platform and portal www.altizure.com with its crowd-sourced Altizure Earth, developed and funded by the HKUST team, rivaling the popular Google Earth!
Long Quan received his Ph.D. in Computer Science at INRIA, France, in 1989. Before joining the Department of Computer Science at the Hong Kong University of Science and Technology (HKUST) in 2001 to found his computer vision group, he had been one of the founding members of the INRIA Grenoble Computer Vision Group since 1990.
He supervised work that received the inaugural award for the best French PhD thesis in computer science (le prix de thèse Gilles Kahn, 1998, by Peter Sturm), the Piero Zamperoni Best Student Paper Award in 2000 (by Maxime Lhuillier), the first of six highlights of SIGGRAPH 2007, and the Best Student Poster Paper of CVPR 2008. Many of his graduate students are now world computer vision leaders at INRIA and CNRS in France, Lund University in Sweden, NUS in Singapore, Beijing University, Alibaba and DJI in China, SFU in Canada, and Microsoft, Google, and Princeton in the USA.
He has served on all the major computer vision journals, as an Associate Editor of IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), a Regional Editor of Image and Vision Computing Journal (IVC), an editorial board member of the International Journal of Computer Vision (IJCV), an editorial board member of the Electronic Letters on Computer Vision and Image Analysis (ELCVIA), an Associate Editor of Machine Vision and Applications (MVA), and an editorial board member of Foundations and Trends in Computer Graphics and Vision.
He has contributed to all the major computer vision conferences: the IEEE International Conference on Computer Vision (ICCV), the European Conference on Computer Vision (ECCV), the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), and the IAPR International Conference on Pattern Recognition (ICPR). He served as a Program Chair of ICPR 2006 Computer Vision and Image Analysis, a Program Chair of ICPR 2012 Computer and Robot Vision, and a General Chair of ICCV 2011 in Barcelona, and he is a General Chair of IEEE CVPR 2022 in New Orleans. He is the founding director of the HKUST Center for Visual Computing and Image Science. He is also a Fellow of the IEEE Computer Society.
Most recently, with his HKUST graduates, he founded altizure.com, the world’s first portal for generating 3D from drone and smartphone photos!
K. Venkatesh Prasad
Ford Motor Company, USA
Title: Automobiles and Mobility Solutions
As human intelligence, imagination & ingenuity continue to create advancements in machine intelligence, we have new ways to serve the mobility needs of our planet. With a world population of about 7.6 billion and immense human and machine intelligence at our disposal, we have the opportunity to create novel experiences and related services associated with traveling from “A” to “B.” Thanks in no small part to advancements in pattern recognition, computer vision and image processing, automobiles are getting “smart” and growing more aware of their surroundings. The world is also getting “smart.” In this talk, we outline some key application areas of machine intelligence in the context of addressing human mobility needs.
K. Venkatesh Prasad is the Senior Technical Leader for Mobility and a member of the Ford Technology Advisory Board for Open Innovation. Prior to this role, he was Ford’s Global Innovation Implementation Leader, Vehicle Components & Systems Engineering, and during a 3-year period helped establish eight makerspaces for employee innovation across global engineering centers. In the earlier years, Prasad applied computer vision, based on early CMOS cameras, to several automobile applications including automatic headlamp detection. In 2011, Prasad architected OpenXC, the industry’s first open-source hardware and open-source software platform, an “innovator’s toolkit,” which launched in 2013 and today is one of the tools used by Ford employee-innovators to design, test and release products and by researchers and experimenters the world over. He also co-founded Ford’s startup-lab in 2012 as a 5-person office; a year later, it scaled to become Ford’s Innovation Center Palo Alto and today is a 150-person operation. Prasad earned a Ph.D. in electrical and computer engineering from Rutgers University in 1990, an M.S. from Washington State University, and engineering degrees from IIT-Madras and NIT-Trichy in India. He has more than 25 years of collaborative experience with universities, startups, automotive suppliers and technology firms. He has co-edited three issues of the Proceedings of the IEEE (on Automotive Technologies; Aerospace and Automotive Software and Cyber-Physical Systems). Prior to coming to Dearborn, Michigan, in 1996, Prasad worked in Menlo Park, California (at Ricoh Innovations) and before that in Pasadena, California (at Caltech and, as a faculty affiliate, at the NASA Jet Propulsion Laboratory).
Jianchang Mao, Microsoft, USA
Title: Achieving Human Parity Performance in Pattern Recognition and Language Understanding by Machines
For more than half a century, computer scientists have been attempting to train computer systems to perform human perception and cognition tasks, such as recognizing images and speech, comprehending text, and translating languages. But until recently those systems were plagued by stagnant accuracy that was far below human performance. In recent years, with the breakthroughs in Deep Learning, advances in the state-of-the-art performance of those systems have gained strong momentum, thanks to the rapid increase in computing power, big data, and advances in machine learning algorithms. Today, AI breakthroughs are coming at an accelerated pace. The performance of computer systems on several perception and cognition tasks has reached human parity. For example, in 2015 Microsoft researchers achieved 96% accuracy in the ImageNet Computer Vision Challenge, which is as good as a Stanford graduate student. Less than a year later, Microsoft’s speech recognition system achieved a 5.1% error rate on the Switchboard dataset, which is at parity with professionals who do transcription! In January 2018, Microsoft was the first to achieve human parity in text comprehension tasks on the Stanford Question Answering Dataset. And two months later, Microsoft announced that it had reached human parity in English-to-Chinese and Chinese-to-English machine translation on the news dataset. In this talk, I will briefly describe our journey to achieving human parity on these tasks and the technologies that enabled the breakthroughs. I will also present other applications of Deep Learning, such as OCR in unconstrained environments and Advertising.
Dr. Jianchang (JC) Mao is Corporate Vice President of Bing Ads Marketplace & Serving in the Artificial Intelligence & Research division at Microsoft. He leads a global team of engineers, scientists, product managers, marketplace operators, and analysts, responsible for building technologies and products and for running the multi-billion-dollar advertising marketplace that powers Bing, Yahoo!, AOL, and other syndication partners.
Prior to joining Microsoft, Mao was Vice President and Head of Advertising Sciences at Yahoo! Labs, overseeing the R&D of advertising technologies and products. He was also the science and engineering director responsible for the development of backend technologies for several Yahoo! social search products, including Yahoo! Answers. At Yahoo!, Mao received the Leadership Superstar Award in 2010, and received a Superstar Team Award in 2008. Prior to joining Yahoo!, Mao was director of emerging technologies and principal architect at Verity Inc., a leader in Enterprise Search (acquired by Autonomy and then acquired by HP), from 2000 to 2004. Mao began his career as a research staff member at the IBM Almaden Research Center from 1994 to 2000, after receiving his PhD degree in computer science from Michigan State University in 1994.
Mao’s research interests include AI, machine learning, data mining, information retrieval, computational advertising, pattern recognition, and image processing. He has published more than 50 papers in journals, book chapters, and conferences, and holds 29 U.S. patents. Mao received an Honorable Mention Award in ACM KDD Cup 2002 (Task 1: Information Extraction from Biomedical Articles), an IEEE Transactions on Neural Networks Outstanding Paper Award in 1996 (for his 1995 paper), and an Honorable Mention Award from the International Pattern Recognition Society in 1993. Mao is a Fellow of IEEE.
Ashok Popat, Google, Inc., USA
Title: Advice to a Promising OCR Researcher
Document Analysis and Recognition remains a vibrant and challenging field, spanning and touching several domains, including pattern recognition, computer vision, linguistics, digital humanities, and augmented reality. Probably most of the best work in this field remains to be done. That work will build on what came before — in terms of techniques and understanding already achieved, but also by learning from the best practices of our colleagues and predecessors. As an OCR researcher, in this talk I’ll try to reflect on some of the advice I’ve received from mentors, colleagues, and others in various places, including MIT, Xerox PARC, and Google. I’ll present the ideas in the context of developing an Optical Character Recognition system at Google.
Ashok C. Popat received the SB and SM degrees from the Massachusetts Institute of Technology in Electrical Engineering in 1986 and 1990, and a PhD from the MIT Media Lab in 1997. He is a Research Scientist at Google in Mountain View, California. At Google he has worked on several projects, including Books, Translate, and (most recently) Optical Character Recognition (OCR). He is part of a team that has developed an OCR system that can handle more than 200 languages, many of which are currently supported through the Cloud Vision web-based API. Prior to joining Google in 2005 he worked at Xerox PARC with Gary Kopec and Henry Baird, on Document Image Decoding. Between 2002 and 2005 he was also a consulting assistant professor of Electrical Engineering at Stanford, where he co-taught (with Dan Bloomberg) a course “Electronic documents: paper to digital.” He has also worked at Motorola, Hewlett Packard, PictureTel, and the EPFL in Switzerland. His areas of interest include signal processing, data compression, and pattern recognition. He enjoys running, skiing, sailing, hiking, and spending time with his wife and two daughters.
Zhi-Hua Zhou, Nanjing University, China
Title: An Exploration to Non-NN Style Deep Learning
Deep learning has been a hot topic during the past few years. Generally, the term “deep learning” is regarded as a synonym of “deep neural networks (DNNs)”. In this talk, we will discuss the essentials of deep learning and claim that it does not necessarily have to be realized by neural networks. We will then present an exploration of non-NN style deep learning, where the building blocks are non-differentiable modules and the training process does not rely on backpropagation.
Zhi-Hua Zhou is a Professor at Nanjing University, China. He is the Head of the Department of Computer Science and Technology, Dean of the School of Artificial Intelligence, and Founding Director of the LAMDA Group. His main research interests are in artificial intelligence, machine learning and data mining. He authored the books “Ensemble Methods: Foundations and Algorithms (2012)” and “Machine Learning (in Chinese, 2016)”, and has published more than 150 papers in top-tier international journals and conferences. According to Google Scholar, his publications have received more than 30,000 citations, with an H-index of 85. He also holds 22 patents and has extensive experience in industrial applications. He has received various awards, including the National Natural Science Award of China, the PAKDD Distinguished Contribution Award, and the IEEE ICDM Outstanding Service Award. He serves as the Executive Editor-in-Chief of Frontiers of Computer Science, and as Action/Associate Editor of Machine Learning, IEEE PAMI, ACM TKDD, etc. He was Associate Editor of ACM TIST, IEEE TKDE, IEEE TNNLS, IEEE TCDS, etc. He founded ACML (Asian Conference on Machine Learning) and served as General Chair of IEEE ICDM 2016, Program Chair of the IJCAI 2015 Machine Learning track, etc. He will serve as Program Chair of AAAI 2019 and IJCAI 2019. He is the Chair of CCF-AI, and was Chair of the IEEE CIS Data Mining Technical Committee. He is a foreign member of the Academy of Europe, and a Fellow of the ACM, AAAI, AAAS, IEEE, IAPR, CCF and CAAI.
Alison Noble, University of Oxford, UK
Title: Human Intelligence, Artificial Intelligence and How They Are Changing Ultrasound Image Analysis
Ultrasound imaging is widely used in clinical practice but requires expertise to acquire images and interpret them. Recent advances in machine learning applied to imaging are changing the way we can analyse ultrasound images and extract clinically useful information from ultrasound images and video. Ultrasound images are, after all, “just” spatial maps of acoustic patterns so we would hope that the pattern-recognition power of machine learning would be well-suited for their analysis. In this talk I will describe some recent work of my group on machine learning applied to ultrasound image analysis, some of the interesting challenges specific to this application domain, and highlight some emerging topics of research interest.
Professor Alison Noble is the Technikos Professor of Biomedical Engineering at the Institute of Biomedical Engineering, University of Oxford, UK. She is best known for her group’s research on ultrasound image analysis, much of which has involved interdisciplinary collaboration with clinical partners. Her current interests are in machine learning applied to ultrasound imaging with application to fetal medicine in the developed world and LMICs, ranging from developing next-generation tools for non-expert users of ultrasound technology to point-of-care computer-assisted basic ultrasound assessment. Throughout her career she has maintained a keen interest in the commercialization of scientific research as a pathway to realizing the impact of academic research. She co-founded and is a consultant to Intelligent Ultrasound Ltd, which became part of MedaPhor Group PLC in 2017.
Professor Noble served as the President of the Medical Image Computing and Computer-Assisted Interventions (MICCAI) Society from 2013-16. She is a European Research Council Advanced Research award holder. She is a Fellow of the Royal Academy of Engineering (2008) and a Fellow of the Royal Society (2017) and was awarded an OBE for services to science and engineering in the Queen’s Birthday Honours 2013.
Tieniu Tan (China)
Josef Kittler (UK)
Anil Jain (USA)
Cheng-Lin Liu (China)
Rama Chellappa (USA)
Matti Pietikäinen (Finland)
Local Arrangements Chairs:
Liang Wang (China)
Jianhua Tao (China)
International Liaison Chair:
Gunilla Borgefors (Sweden)
Invited Speakers Chairs:
Katsushi Ikeuchi (China)
Denis Laurendeau (Canada)
Ingela Nystrom (Sweden)
David Suter (Australia)
Zhaoxiang Zhang (China)
Yingli Tian (USA)
Greg Mori (Canada)
Zhouchen Lin (China)
Dimosthenis Karatzas (Spain)
Xiang Bai (China)
David Doermann (USA)
Jean-Marc Ogier (France)
Umapada Pal (India)
Daniel Lopresti (USA)
Ran He (China)
Sponsorship and Exhibitions Chairs:
Yasushi Yagi (Japan)
Qiang Ji (USA)
Andreas Dengel (Germany)
Tao Wang (China)
Local Arrangements Committee Members:
Junliang Xing, NLPR, CASIA
Bin Fan, NLPR, CASIA
Shibiao Xu, NLPR, CASIA
Tianzhu Zhang, NLPR, CASIA
Jing Dong, NLPR, CASIA