Keynote Speaker
Dr. Kirk Borne
Data Scientist & Top Worldwide Data Science Influencer
Dr. Borne is a Data Scientist and Top Worldwide Data Science Influencer. He is a career data professional, data science leader, and research scientist with a background in astrophysics. Previously, Kirk was professor of Astrophysics and Computational Science at George Mason University. Before that, he spent 20 years supporting data systems activities for NASA space science missions, including roles with the Hubble Space Telescope and NASA’s Space Science Data Operations Office. Kirk earned his Ph.D. in astronomy from the California Institute of Technology. He is a top worldwide influencer on social media where he promotes data science, AI, machine learning, and data literacy for all.
Featured Speaker
Conference Schedule
**PLEASE NOTE: ALL EVENTS ARE GIVEN IN CENTRAL TIME**
9:30am - 11:00am CDT
Registration & Networking
11:00am - 11:45am CDT
Keynote Address
Dr. Kirk Borne: Intelligent Data Understanding for Smarter AI
11:45am - 12:00pm CDT
Break
12:00pm - 1:00pm CDT
Concurrent Sessions
Workshop – Rebeca Pop: Telling Stories with Data
Debbie Reynolds: The Data Privacy fundamentals & challenges of Data Analytics & AI
Troy Hernandez, Ph.D.: Tools Enabling Data Science
1:15pm - 2:15pm CDT
Concurrent Sessions
Workshop – Dr. Mimi Tsuruga: Analyzing Live Data with Elasticsearch
Workshop – Rehgan Avon: ModelOps
Tyrone Grandison, Ph.D.: Civil Data & Data for Government
Panel – Diversity, Equity and Inclusion
2:30pm - 3:30pm CDT
Fireside Chat – Dr. Chetan Gupta: Industrial AI
3:30pm - 3:45pm CDT
Break
3:45pm - 4:45pm CDT
Concurrent Sessions
Dr. David Bader: High Performance Data Analytics
Panel – My Odyssey; My Journey to Data
Roger Moore: Delivering Value with Data Science, AI & ML
5:00pm - 6:00pm CDT
Concurrent Sessions
EDI @ Fermilab: Equity, diversity, and inclusion encourage innovation and persistence. Fermilab strives to create more outlets for diverse recruiting and hiring while also creating a more welcoming environment for the existing workforce. This session will provide insight into Fermilab’s approach to equity, diversity, and inclusion and introduce a panel of Fermilab engineers and scientists to share their personal experiences.
Dr. Tanya Berger-Wolf: Trustworthy AI for Wildlife Conservation: AI and Humans Combating Extinction Together
Panel – Future of Work & Education in Data Science
- Mustafa Bilgic, David Uminsky, and Jonathan Williams
6:00pm - 6:30pm CDT
Closing Remarks
DataYap Virtual Conference
Inspiring Speakers
Hear from seasoned tech professionals as they tackle cutting-edge topics including data security, modern analytics applications of data visualization, and the future of work.
Networking Opportunities
Chat with tech professionals across all career stages during networking breaks. Get to know seasoned and connected panelists and keynote speakers.
Career Development
Learn what it takes to start (or advance) your career in tech. Get advice from fellow professionals and make great new connections.
Sponsors & Partners
Dr. Kirk Borne
Keynote
Intelligent Data Understanding for Smarter AI
Abstract
This presentation will discuss benefits and applications of intelligent data understanding for smarter AI operations. Intelligent data understanding produces “smart data”, which are labeled, tagged, and annotated data. Labels, tags, and annotations capture the content, context, uses, sources, and characterizations (patterns and features) associated with the data. Smart data can be generated through machine learning, or applied by human experts. Labels can be learned and applied in existing data lakes, in streaming data, and in sensor data (collected in devices at the “edge”). Consequently, intelligent data understanding thrives at the convergence of AI and IoT (Internet of Things). Labels are curated and stored with the data, thus enabling curation, cataloguing (indexing), search, orchestration, delivery, and effective use of the “right data” in AI applications, including data-driven decision-making and autonomous operations. Intelligent data understanding thus meets the needs for smarter AI operations, which must devour streams of data – not just any data, but smart data – the right data at the right time in the right context.
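As a rough illustration of the idea that labels stored with the data enable cataloguing and search, the following minimal Python sketch keeps each record together with its labels and builds an inverted index over them. The names `SmartRecord` and `LabelCatalog`, the label keys, and the sample records are all hypothetical and are not taken from the talk; this is one possible reading of the abstract, not the speaker's method.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class SmartRecord:
    """A raw data item stored together with its labels (hypothetical schema)."""
    record_id: str
    payload: dict
    labels: dict = field(default_factory=dict)  # e.g. {"source": "edge-sensor"}


class LabelCatalog:
    """Inverted index from (label key, label value) pairs to record ids."""

    def __init__(self):
        self._records = {}
        self._index = defaultdict(set)

    def add(self, record):
        # Store the record and index every label it carries.
        self._records[record.record_id] = record
        for key, value in record.labels.items():
            self._index[(key, value)].add(record.record_id)

    def search(self, **criteria):
        """Return records whose labels match every key=value criterion."""
        if not criteria:
            return [self._records[i] for i in sorted(self._records)]
        id_sets = [self._index.get(item, set()) for item in criteria.items()]
        matching = set.intersection(*id_sets)
        return [self._records[i] for i in sorted(matching)]


# Illustrative data only: two sensor readings and one data-lake document.
catalog = LabelCatalog()
catalog.add(SmartRecord("r1", {"temp_c": 71.2},
                        {"source": "edge-sensor", "context": "turbine"}))
catalog.add(SmartRecord("r2", {"temp_c": 20.1},
                        {"source": "edge-sensor", "context": "office"}))
catalog.add(SmartRecord("r3", {"text": "maintenance log"},
                        {"source": "data-lake", "context": "turbine"}))

hits = catalog.search(source="edge-sensor", context="turbine")
print([r.record_id for r in hits])  # ['r1']
```

The index is keyed on exact label values for simplicity; a real catalogue would likely need richer matching, and the labels themselves could come from a model or from human annotators, as the abstract notes.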
Bio
Dr. Borne is a Data Scientist and Top Worldwide Data Science Influencer. He is a career data professional, data science leader, and research scientist with a background in astrophysics. Previously, Kirk was professor of Astrophysics and Computational Science at George Mason University. Before that, he spent 20 years supporting data systems activities for NASA space science missions, including roles with the Hubble Space Telescope and NASA’s Space Science Data Operations Office. Kirk earned his Ph.D. in astronomy from the California Institute of Technology. He is a top worldwide influencer on social media where he promotes data science, AI, machine learning, and data literacy for all.
David Bader, Ph.D.
Topic
Solving Global Grand Challenges with High Performance Data Analytics
Abstract
Data science aims to solve grand global challenges such as detecting and preventing disease in human populations; revealing community structure in large social networks; protecting our elections from cyber-threats; and improving the resilience of the electric power grid. Unlike traditional applications in computational science and engineering, solving these social problems at scale often raises new challenges because of the sparsity and lack of locality in the data. These challenges call for research on scalable algorithms and architectures, for frameworks that solve these real-world problems on high performance computers, and for improved models that capture the noise and bias inherent in the torrential data streams. In this talk, Bader will discuss the opportunities and challenges in massive data science for applications in social sciences, physical sciences, and engineering.
Bio
David A. Bader is a Distinguished Professor in the Department of Computer Science and Director of the Institute for Data Science at New Jersey Institute of Technology. Prior to this, he served as founding Professor and Chair of the School of Computational Science and Engineering, College of Computing, at Georgia Institute of Technology. He is a Fellow of the IEEE, AAAS, and SIAM, and advises the White House, most recently on the National Strategic Computing Initiative (NSCI). Dr. Bader is a leading expert in solving global grand challenges in science, engineering, computing, and data science. His interests are at the intersection of high-performance computing and real-world applications, including cybersecurity, massive-scale analytics, and computational genomics, and he has co-authored over 250 articles in peer-reviewed journals and conferences. Dr. Bader has served as a lead scientist in several DARPA programs including High Productivity Computing Systems (HPCS) with IBM, Ubiquitous High Performance Computing (UHPC) with NVIDIA, Anomaly Detection at Multiple Scales (ADAMS), Power Efficiency Revolution For Embedded Computing Technologies (PERFECT), Hierarchical Identify Verify Exploit (HIVE), and Software-Defined Hardware (SDH). He has also served as Director of the Sony-Toshiba-IBM Center of Competence for the Cell Broadband Engine Processor. Bader is a cofounder of the Graph500 List for benchmarking “Big Data” computing platforms. Bader is recognized as a “RockStar” of High Performance Computing by InsideHPC and as HPCwire’s People to Watch in 2012 and 2014. In April 2019, Bader was awarded an NVIDIA AI Lab (NVAIL) award, and in July 2019, Bader received a Facebook Research AI Hardware/Software Co-Design award.
Tony Baylis
Director, Diversity, Equity and Inclusion
Bio
Tony Baylis is a senior leader, partner, and advocate for Diversity, Equity and Inclusion (DEI) programs and activities for Lawrence Livermore National Laboratory (LLNL). Tony manages the Laboratory’s strategic interactions and execution in building a diverse and inclusive workforce at LLNL. He collaborates with academia, government, industry, community, and diversity organization stakeholders. Tony represents the Laboratory on the subjects of DEI, science, technology, engineering, arts, and mathematics (STEAM), outreach, and student programs.
Tony has created and implemented successful inclusive programs focused on increasing the representation of women and the underserved communities in various organizations and industries. He has mentored over 200 students and professionals throughout his career. Tony serves as a Department of Energy champion, a Board Member for the EmpowHer Institute, an Advisory Board Member for the Computing Alliance of Hispanic-Serving Institutions (CAHSI), an Industry Advisory Board Member for the University of Florida Computer & Information Science & Engineering Department (CISE), an Advisory Board Member at Polytechnic University of Puerto Rico (PUPR) for the Center of Academic Excellence (CAE) in Cyber Defense Education (CDE), an Advisory Council Member at the Center for Black Studies Research at University of California, Santa Barbara, an AccessComputing Industry Partner, an Inclusion Allies Coalition Member, and a Diversity, Equity & Inclusion consultant. Tony recently received the honor of becoming the GEM Fellow Program Employer Representative of the Year for 2020.
Tony also serves as a conference program committee member, a speaker, moderator, facilitator, contributor, and reviewer for a number of programs, conferences, and workshops, such as The Centre for Global Inclusion, International Earth-Life Science Institute, Eurographics, IEEE VR, Supercomputing Conference Series, National Science Foundation, Grace Hopper Conference, Richard Tapia Diversity in Computing Conference, American Indian Science and Engineering Society, and many others. He works with a number of Minority Serving Institutions (MSIs), specifically Historically Black Colleges and Universities (HBCUs), Hispanic Serving Institutions (HSIs), and American Indian and Alaska Native-Serving Institutions.
Tony is an Association for Computing Machinery (ACM) Special Interest Group on Computer Graphics and Interactive Techniques (SIGGRAPH) member and currently serves SIGGRAPH in the role of Diversity and Inclusion Committee Chair. He is also a member of the Society for Human Resource Management (SHRM). Tony is a graduate of the University of Illinois, Champaign-Urbana, and LLNL’s Leadership Program, and he has completed Cornell University’s Diversity and Inclusion Certificate Program. Tony’s career represents 35 years of administrative, project, program, technical, and organizational management and senior leadership. He has worked in industry, broadcast media, and scientific and technical environments for over 30 years.
His passions are his children, leading by example, delivering results, demonstrating allyship daily, living a fully inclusive life, being socially conscious, learning continuously, traveling with curiosity, and serving others.
Rebeca Pop
Workshop Title
Telling Stories with Data
Description
“Don’t just show the notes, play the music!” – Hans Rosling, Swedish physician, academic, and master storyteller.
Most organizations are aware that developers, data scientists and analysts are essential to creating a data-driven environment. Fewer understand how to use data and charts to tell powerful stories that don’t only convey the “what”, but also the “so what” and the “what next.” In other words, most organizations already have the notes. Now, they need to learn how to play the music.
In this 45-minute workshop, Rebeca will walk you through her process of telling stories with data, followed by a short group exercise in which you’ll start thinking about how to craft a data story.
Bio
Rebeca Pop is the founder of Vizlogue, a Data Visualization and Storytelling Lab that offers training and consulting services. Vizlogue’s mission is to help companies and organizations communicate more effectively with data. To date, Rebeca has delivered presentations to over 1,500 participants all around the world. Her workshops are grounded in a deep understanding of adult learning strategies and combine hands-on exercises with feedback sessions and real-life examples. Rebeca also teaches data visualization at the University of Chicago and at Northwestern University.
Dr. Mustafa Bilgic
Bio
Dr. Mustafa Bilgic is an Associate Professor, the Director of the Masters in Artificial Intelligence and BS in Artificial Intelligence programs, and the Director of the Machine Learning Laboratory in the Computer Science Department at Illinois Institute of Technology. He received his BS in Computer Science from the University of Texas at Austin and his MS and PhD in Computer Science from the University of Maryland at College Park. He received an NSF CAREER award. His research interests include active and interpretable machine learning, statistical relational learning, recommender systems, and probabilistic graphical models.
Dr. Tyrone W A Grandison
Bio
Dr. Tyrone W A Grandison holds multiple Chief Technology positions at The TeleHealth Market, Hodos Health, and MStreetX. He is also the Founder and Board Chairman of The Data-Driven Institute, which is a public health non-profit that helps policymakers create and implement effective programs and policies to solve their most critical problems, using the knowledge of the community, data and technology. While at the Data-Driven Institute, Dr. Grandison had the honor of serving in several positions for the company’s clients and partners. He served as Executive Leadership Advisor for Democracy Works. He served as Civic Tech Program Manager for the US Census Bureau. He served as the first Chief Technology Officer for Pearl Long Term Care Solutions. He served as the Vice President of Data for U.Group. He serves as Chief Security Officer for POCMI. He serves on the Board of Advisors for Citizens for Citizens Fund.
Dr. Chetan Gupta
Title
Fireside chat on Industrial AI
Description
Industrial AI is concerned with the application of Artificial Intelligence (AI), Machine Learning (ML) and related technologies towards addressing real-world use cases in industrial and societal domains and has the potential to transform the world we live in. Industrial AI use cases can be broadly categorized into the horizontal areas of maintenance & repair, operations & supply chain, quality, safety, design, end-to-end optimization, etc., and have applications in a variety of domains. In this chat we will give some real-world examples, highlight challenges and lessons learned, and point out new research directions and developments.
Bio
Chetan Gupta is VP, Chief Data Scientist & Architect, and is the Head of the Industrial AI Lab at Hitachi America, Ltd. R&D. He has more than 15 years of experience in analytics, AI, big data, and related domains. Over his career he has worked both as a machine learning data scientist as well as in designing systems and architectures for big data applications. At Hitachi, he manages a large team of data scientists, architects and developers that is engaged in developing cutting edge solutions and opening new frontiers in industrial analytics. His team builds fundamental horizontal technologies that are then used to build solutions for industry specific verticals. He has led efforts to build horizontal solutions in predictive maintenance, quality, operations monitoring and control, and for verticals such as mobility, mining, building energy management systems, etc. Over the years Chetan has led multiple research and development teams and mentored young researchers. He has more than 100 papers and patents in the area of Industrial AI, data mining/machine learning, data stream systems, complex event processing, workload management, etc. Chetan has a Ph.D. in Mathematics and M.S. in Mathematical Computer Science and Chemical Engineering from University of Illinois, Chicago.
Roger Moore
Topic
Delivering Value with Data Science, AI, and ML
Description
The days of the “golden gut” are gone. Senior executives need to leverage all the data and analytics they can muster to make decisions and execute on opportunities that create a competitive advantage. Enter the data science team. Their big data, artificial intelligence, and machine learning provide the information and insight executives need. However, there is a gap limiting the ability of the data science team and executives to fully realize their collective potential. Understanding how the organization makes decisions, the roles that each group plays in the decision-making process, and how to eliminate the gap promises to create a data- and analytics-driven competitive advantage.
Bio
Roger Moore (roger@nlitx.com) is the CEO and Managing Director at NLITX (www.nlitx.com). NLITX uses data and analytics to solve strategic problems. Roger is also a Lecturer at the University of Chicago. He teaches Leadership Skills in the Master of Science in Analytics Program. Previously Roger was VP, Analytics and Customer Operations at Entytle. Entytle provides a SaaS tool enhancing customer installed base automated sales with AI and machine learning. Roger has also worked at Gartner, Sagence, Booz & Co, PwC, Diamond Management & Technology Partners, and the Boston Consulting Group.
Roger is active in the data and analytics communities in the Chicago area. He attends many MeetUps, roundtables and industry events. Roger founded and chairs the Chicago Booth Big Data & Analytics Roundtable.
Roger holds an MBA with focus in Finance and Statistics from the University of Chicago’s Booth School of Business. He also received a BS in Electrical and Computer Engineering from the University of Wisconsin – Madison.
Debbie Reynolds
Topic
The Data Privacy fundamentals and challenges of Data Analytics and AI
Description
The global Data Privacy regulation landscape is expanding as Data Analytics and AI are more widely developed and adopted. Although understanding regulations is essential, it is likely more important that we look at the fundamental themes that can guide the robust development of Data Analytics and AI. This discussion will outline the five fundamental data privacy themes that any data process can adopt as guiding principles.
Bio
Debbie Reynolds is the Founder, CEO, and Chief Data Privacy Officer of Debbie Reynolds Consulting LLC. Debbie Reynolds, “The Data Diva,” is a world-renowned technologist, thought-leader, and advisor to Multinational Corporations for handling global Data Privacy, Cyber Data Breach response, and complex cross-functional data-driven projects. Ms. Reynolds is an internationally published author, highly sought speaker, and top media presence about global Data Privacy, Data Protection, and Emerging Technology issues. Ms. Reynolds has been named to the Global Top 20 CyberRisk Communicators by The European Risk Policy Institute, 2020, and recognized as one of the stellar women who know Cyber by Cybersecurity Ventures in 2021.
Ms. Reynolds has contributed to the books The GDPR Challenge: Privacy, Technology, and Compliance In An Age of Accelerating Change and eDiscovery for Corporate Counsel. She has written for publications including The International Journal for the Data Protection Officer, Privacy Officer, and Privacy Counsel, Bloomberg Law, Thomson Reuters West, Westlaw Journal, Today’s General Counsel Magazine (TGC), Law360 and the International Legal Technology Association (ILTA). She has been interviewed and quoted in media outlets including Tycoon, Authority Magazine, Medium, Lifewire, CMSWire, Bloomberg Big Law Business, Public Broadcasting Service (PBS), Digiday, Privacy Laws and Business, Identity Review, Biometric Update, LegalTech News, Law.com, Law360, The Recorder, High Performance Counsel (HPC), Legal Business World, Toyo Keizai Japan, and American Lawyer.
Prof. David Uminsky
Bio
David Uminsky joined the University of Chicago in September 2020 as a senior research associate and Executive Director of Data Science. He was previously an associate professor of Mathematics and Executive Director of the Data Institute at University of San Francisco (USF). His research interests are in machine learning, signal processing, pattern formation, and dynamical systems. David is an associate editor of the Harvard Data Science Review. He was selected in 2015 by the National Academy of Sciences as a Kavli Frontiers of Science Fellow. He is also the founding Director of the BS in Data Science at USF and served as Director of the MS in Data Science program from 2014-2019. During the summer of 2018, David served as the Director of Research for the Mathematical Science Research Institute Undergrad Program on the topic of Mathematical Data Science.
Before joining USF he was a combined NSF and UC President’s Fellow at UCLA, where he was awarded the Chancellor’s Award for outstanding postdoctoral research. He holds a Ph.D. in Mathematics from Boston University and a BS in Mathematics from Harvey Mudd College.
Dr. Tanya Berger-Wolf
Topic
AI for Wildlife, Social Good and Data Colonization
Description
Increasingly, AI is the foundation of decisions big and small, affecting lives of individuals and the wellbeing of our planet, the source of income for corporations and the foundation of resource distribution for populations. Data-driven, AI-enabled decisions are also the hope of solving our planet’s biggest challenges, from climate change and poverty to pandemics and global crime. But if these solutions are to be trusted by those for whom they are intended, those whom they affect the most, then the entire process of decision-making must be fair, just, inclusive, and participatory. The intended beneficiaries of the solutions must be more than mere data points or data providers but rather active partners every step of the way, from data to solution. I will show how this can work in the context of conservation. I will present an example of how a data-driven, AI-enabled decision process becomes trustworthy by opening a wide diversity of opportunities for participation, supporting community-building, addressing the inherent data and computational biases, and providing transparent measures of performance. The community becomes the decision-maker, and AI scales the community, as well as the puzzle of data and solutions, to the planetary scale, turning massive collections of images into a high-resolution information database, enabling scientific inquiry, conservation, and policy decisions.
Dr. Berger-Wolf will show how it all can come together to a deployed system, Wildbook, a project of tech for conservation non-profit Wild Me, with species including whales (flukebook.org), sharks (whaleshark.org), giraffes (giraffespotter.org), and many more. Read more: https://www.nationalgeographic.com/animals/2018/11/artificial-intelligence-counts-wild-animals/
and https://www.forbes.com/sites/bernardmarr/2021/01/29/the-amazing-ways-wild-me-uses-artificial-intelligence-and-citizen-scientists-to-help-with-conservation/
Bio
As a computational ecologist, Dr. Berger-Wolf works at the unique intersection of computer science, data science, wildlife biology and social sciences. She creates computational solutions to address questions such as how environmental factors affect the behaviors of social animals (humans included). She is also a founding member and project lead for Wildbook (a project of the non-profit Wild Me), an open source software platform that supports the use of AI, computer vision, citizen science and collaboration to accelerate wildlife research to understand and counter widespread wildlife decline.
Dr. Troy Hernandez
Title
Tools Enabling Data Science: Delivering Data Science with the Ramstack
Description
Many data scientists have been using markdown for years; e.g. Rmarkdown, Jupyter Notebooks, Slack/reddit comments, etc. The benefits of using Rmarkdown to deliver your data science insights include a human readable plain-text format, version control, and seamless integration of code.
In parallel with this development has been the proliferation of the Jamstack concept for web development. The “Jam” in Jamstack refers to using JavaScript, APIs, and markup to create websites. Benefits of the Jamstack include better performance, higher security, cheaper and easier scaling, and a better developer experience (https://jamstack.org/).
While many data scientists already host their blogs and data science projects via blogdown, the connection between these concepts remains underappreciated. This talk introduces the term *Ramstack*; i.e. generating Jamstack-style websites using R, APIs, and markdown. With a little effort, any data scientist familiar with markdown can now communicate their data science results at scale utilizing GitOps best practices… and, if necessary, moonlight on the cutting edge as a mediocre web dev.
Bio
Troy Hernandez is an American statistician and data scientist from Chicago, IL. He obtained his PhD in statistics (machine learning) from the University of Illinois at Chicago in 2013. Troy has applied his machine learning expertise to diverse fields such as virology, urban planning, and heliophysics. He is currently an IBMer, a community volunteer with the Pilsen Environmental Rights and Reform Organization (PERRO) and the Chicago R User Group (CRUG), and currently serves as the Chairman of the Cook County Green Party.
Jonathan Williams
Bio
Jonathan Williams is a Director of Curriculum Development at Trilogy Education Services, a 2U Inc. Brand, where he leads the design and development of technical learning content in the Data discipline. Jonathan has worked broadly in adult education as a learning experience designer, having designed courses at Harvard Business School, General Assembly, and New York University. Jonathan holds a BS in Mathematical Statistics from Wake Forest University, MS in Strategic Design and Management from Parsons School of Design, and is currently pursuing a Doctorate of Design (DDes) at North Carolina State University.
Dr. Mimi Tsuruga
Topic
Analyzing live data with Elasticsearch
Description
Elastic builds software to make data usable in real time and at scale for search, logging, security, and analytics use cases. In this workshop we will see how easily streaming data can be ingested in Elastic Cloud. We will then build dashboards and infographics using live data.
Bio
Mimi is an Education Architect on Elastic’s Certification Team based out of Mountain View, California. She teaches and develops instructor-led training courses, and maintains Elastic certification exams. Before joining Elastic, Mimi was an assistant professor of mathematics at the University of California, Davis. Her research area focused on developing algorithms applying topological methods in big data analysis, including genomic data and image processing. She also has research experience in computational topology, data visualization, and mathematical graphics working at the Technical University in Berlin, the Free University in Berlin, and Wolfram Research.
Rehgan Avon
Title
ModelOps Workshop with Ikonos Analytics
Description
How does a model go from the whiteboard to a mission-critical business asset? Organizations are innovating using machine learning, A.I., and other analytic solutions to drive their competitive edge. Teams face new and unique challenges operationalizing and governing these analytics solutions. ModelOps is the collection of best practices to deploy, govern, and optimize models in the enterprise.
This workshop focuses on how your team can leverage ModelOps at the strategic, tactical, and execution levels to optimize your models. Data Scientists, Data Engineers, Architects, and Project Managers will learn the skills and methods of ModelOps through hands-on exercises in this workshop. Learn ModelOps to unlock the potential of your models.
Challenges this workshop will address:
- Aligning teams on needs for model deployment, monitoring, and governance.
- Defining the abstractions for integrating models into the business workflows.
- Understanding roles and the handoffs between teams and functions.
- Governance, risk management, and sustainability of models.
Agenda:
- Introduction to ModelOps
- Roles and responsibilities
- Requirements for operationalizing a model
- Model Management – versioning, meta-data, lineage, & governance
- Model Life Cycle definition
- Continuous Improvement – Monitoring and Ops
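To make the "Model Management" agenda item a little more concrete, here is a minimal Python sketch of a registry that tracks versions, metadata, and lineage for a model. The class names `ModelVersion` and `ModelRegistry`, the metadata keys, and the example model are hypothetical illustrations only; they are not the workshop's material and not the API of any ModelOps product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelVersion:
    """One registered version of a model (hypothetical schema)."""
    name: str
    version: int
    metadata: dict
    parent: Optional[int] = None  # version this one was derived from, if any


class ModelRegistry:
    """In-memory registry keyed by (model name, version number)."""

    def __init__(self):
        self._versions = {}

    def register(self, name, metadata, parent=None):
        # A parent must already exist so that lineage never dangles.
        if parent is not None and (name, parent) not in self._versions:
            raise ValueError(f"unknown parent version {parent} for {name!r}")
        version = 1 + sum(1 for (n, _) in self._versions if n == name)
        entry = ModelVersion(name, version, dict(metadata), parent)
        self._versions[(name, version)] = entry
        return entry

    def lineage(self, name, version):
        """Return version numbers from the given version back to its root."""
        chain = []
        current = version
        while current is not None:
            entry = self._versions[(name, current)]
            chain.append(entry.version)
            current = entry.parent
        return chain


# Illustrative usage with made-up metadata.
registry = ModelRegistry()
v1 = registry.register("churn", {"algo": "logreg", "owner": "team-a"})
v2 = registry.register("churn", {"algo": "gbm"}, parent=v1.version)
v3 = registry.register("churn", {"algo": "gbm", "tuned": True}, parent=v2.version)

print(registry.lineage("churn", v3.version))  # [3, 2, 1]
```

Recording the parent version at registration time is one simple way to keep lineage auditable; governance checks, approvals, and monitoring hooks would sit on top of a structure like this.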
Bio
Rehgan Avon | CEO & Co-founder
With a background in integrated systems engineering and a strong focus on analytical technology, Rehgan has worked on architecting solutions and products around operationalizing machine learning models at scale within the large enterprise. Rehgan was a critical piece in developing the fundamental process architecture for one of the leading product companies in model operations, ModelOp. Rehgan’s previous experience has been fueled by a passion for early-stage startups and product development. She is also Founder & CEO of Women in Analytics, a global community to support the visibility of women making an impact in the analytics field.