Data Council
  • Videos 915
  • Views 3,812,823
Building an Ecosystem for Open Foundation Models, Together
In this talk, Ce Zhang shares experiences in building the open source foundation model ecosystem through collaboration with the community. He delves into how balancing data quality, model architecture and infrastructure presents both opportunities and challenges. He also discusses navigating the extensive scale and cost of GPU clusters and optimizing their usage. Most importantly, he explores how data quality can be reasoned about in a structured manner to boost model quality.
This video provides a unique perspective on managing technical issues in open source ecosystems and is a must-watch for those interested in understanding the behind-the-scenes of data science and AI development.
👉 Si...
Views: 261

Videos

Stochastic | AI Launchpad '24
255 views · 2 months ago
Stochastic is an end-to-end AI platform for enterprise knowledge work that provides personalized AI agents with zero setup or coding. ABOUT THE SPEAKER: Glenn Ko, Co-founder & CEO, Stochastic. AI LAUNCHPAD: Data Council and Zero Prime Ventures partnered to give six AI-first startups a chance to present brief demos on stage to top investors and elite founders during Data Council's annual conference i...
sea.dev | AI Launchpad '24
280 views · 2 months ago
sea.dev is breaking the constraints of existing data systems and NL2SQL with graph-based tools to allow LLM apps to reliably act on fintech data. ABOUT THE SPEAKERS: Matt Arderne, Co-founder, sea.dev; Marya Bazzi, Co-founder, sea.dev; Vladimirs Murevics, Co-founder, sea.dev. AI LAUNCHPAD: Data Council and Zero Prime Ventures partnered to give six AI-first startups a chance to present brief demos on sta...
Phaselab | AI Launchpad '24
87 views · 2 months ago
Phaselab builds smart automation to make companies’ data privacy programs more effective and efficient. ABOUT THE SPEAKER: Josh Schwartz, Co-founder & CEO, Phaselab. AI LAUNCHPAD: Data Council and Zero Prime Ventures partnered to give six AI-first startups a chance to present brief demos on stage to top investors and elite founders during Data Council's annual conference in Austin 2024. 👉 Sign up fo...
Parea | AI Launchpad '24
282 views · 2 months ago
Parea builds developer tools for evaluating, testing and monitoring LLM-powered applications. ABOUT THE SPEAKER: Joel Alexander, Co-founder, Parea. AI LAUNCHPAD: Data Council and Zero Prime Ventures partnered to give six AI-first startups a chance to present brief demos on stage to top investors and elite founders during Data Council's annual conference in Austin 2024. 👉 Sign up for our “No BS” News...
InQuery | AI Launchpad '24
237 views · 2 months ago
InQuery simplifies data lakehouse maintenance, saving your data team time and money. ABOUT THE SPEAKERS: Erick Enriquez, Co-founder & CEO, InQuery; Khalil Miri, Co-founder & CTO, InQuery. AI LAUNCHPAD: Data Council and Zero Prime Ventures partnered to give six AI-first startups a chance to present brief demos on stage to top investors and elite founders during Data Council's annual conference in Aust...
Dataland | AI Launchpad '24
204 views · 2 months ago
Dataland is the AI-powered internal tools platform. It is the easiest way to deliver high-quality internal tools to your business users. ABOUT THE SPEAKER: Arthur Wu, Co-founder, Dataland. AI LAUNCHPAD: Data Council and Zero Prime Ventures partnered to give six AI-first startups a chance to present brief demos on stage to top investors and elite founders during Data Council's annual conference in Aus...
Rising Tides with Radical Transparency: Why and How to Open Source Your Data Platform
130 views · 2 months ago
Join Tim Castillo from Dagster Labs for an insightful journey into how their data platform became successfully open-sourced. Discover the hurdles, cultural shifts and innovative implementations behind this strategic decision. Data engineers, analytics engineers and data platform engineers - learn how to leverage open source to enhance your projects and contribute to the data community. 👉 Sign u...
Case Studies from a Methodologist on an Experimentation Platform
318 views · 2 months ago
Dive into the world of A/B testing with Microsoft's Experimentation Platform Team. Join Laura Cosgrove for an exclusive tech talk where she uncovers the secrets behind Microsoft’s cutting-edge statistical evaluation and simulation frameworks. In this video, discover the power of Microsoft's variance reduction estimator and its game-changing impact on service efficacy. Ready to elevate your A/B ...
A 101 in Time Series Analytics with Apache Arrow, Pandas and Parquet
1.1K views · 2 months ago
Dive deep into the world of databases and analytics in this talk from Zoe Steinkamp of InfluxData. Learn how you can unleash the potential of Apache Arrow and Apache Parquet for efficient, scalable handling of time-series data. Equip your toolbox with cutting-edge open-source technologies and industry-standard analytics libraries to build the foundation of a high performance analytics applicati...
Unified Stream/Batch Execution with Ibis
535 views · 2 months ago
This talk is a deep dive exploration into the powerful world of Ibis, as Voltron Data showcases their recent work merging batch and streaming concepts and introducing an Apache Flink backend. This comprehensive tutorial will provide you with invaluable insights for working with data across a variety of platforms. Watch the full video to explore the potential of a unified approach for both batch...
How Beam Uses Code-Based Dashboards to Scale Analytics Products
319 views · 2 months ago
In this talk, Emilio Tamez unravels the magic behind dashboards-as-code. From Python scripts to modular design, Beam is breaking down the barriers between complexity and simplicity. The dashboards-as-code methodology has allowed Beam to incrementally approach their goals by building boilerplate dashboards as a series of code-defined, standardized modules which can be arranged into a dashboard i...
Building Responsible and Trustworthy Generative AI Products at LinkedIn
545 views · 2 months ago
Dive into the heart of LinkedIn's commitment to ethical AI development, where revolutionary Generative AI meets responsibility. Listen in to this insightful exploration as Daniel Olmedilla unveils the foundational principles and architecture guiding LinkedIn's AI journey. With a special focus on their cutting-edge Generative AI products and features, this talk gives an exclusive look into Linke...
What Makes for an Effective Data Practitioner in 2024?
415 views · 2 months ago
Listen in as Marck Vaisman shares insights from his years of experience and demystifies the complexities of the data practitioner role, while providing a roadmap for skill development across all levels. Whether you're a seasoned leader aiming to upskill your team or a novice stepping into the realm of data, this video offers valuable guidance to propel your career in the right direction. 👉 Sign...
Is Kubernetes a Database?
510 views · 2 months ago
Uncover how Kubernetes extends beyond stateless apps and now supports stateful workloads and database management with Custom Resources. In this video, discover the potential to eliminate traditional databases by transforming the Kubernetes API into a potent database and metastore. Don't miss this chance to learn how leveraging Kubernetes can revolutionize your tech projects. 👉 Sign up for our "...
How Developers Should Think About the Emerging AI Stack | Together, Pinecone, Anthropic
581 views · 2 months ago
From Playgrounds to Production: The Evolution of AI Evaluation at Coda
100 views · 2 months ago
Events Sourcing with Kafka at Scale
153 views · 2 months ago
Creating a Competitive Advantage in the Age of Intelligence as a Service
110 views · 2 months ago
Build Faster, More Responsive Analytics with a Semantic Layer | Cube Workshop
281 views · 2 months ago
Streaming CDC data from PostgreSQL to Snowflake, challenges and solutions
467 views · 2 months ago
OttoBot: Productionizing LLM Models
147 views · 2 months ago
Building a User-Level Targeting Platform
139 views · 2 months ago
Data Culture 2.0: Leveraging AI to Build Human Connections and Expand Your Influence
98 views · 2 months ago
Beyond Kafka: Cutting Costs and Complexity with WarpStream and S3
271 views · 2 months ago
Ten Years of Building Open Source Standards
249 views · 2 months ago
Move Fast and Don't Break Things -- How to Build a Data Platform that Scales with your Organization
319 views · 2 months ago
Redefining Database Workloads: The Future with Modern Object Storage
99 views · 2 months ago
Beyond MLOps: Building AI systems with Metaflow
658 views · 2 months ago
How to Align AI Capabilities with Product Strategy so You Can Innovate
221 views · 2 months ago

Comments

  • @NiranjanAnandam
    @NiranjanAnandam 16 hours ago

    That's a perfect talk within such short time

  • @chrismcgrath7610
    @chrismcgrath7610 5 days ago

    2nd Legendary talk, I can't remember how many years it's been since I last actually watched a tech video at 1x speed, and had my attention completely captured / enjoyed it, this was fascinating. This guy is in the Venn diagram of smart person, who knows how to properly present/communicate, and was willing to do the prep work. VS many other smart people suck at communication/presentation or aren't willing to do the prep work.

  • @Anhar001
    @Anhar001 6 days ago

    all this jank just to solve the issue which is basically Python. Just write a fully statically compiled binary and shove that on a NFS, then just use rsync between dev machines and NFS. Have a shell script watch binary file changes and relaunch when file is changed. Look ma, I just replaced entire solid with a few bash scripts 😂

  • @jimshtepa5423
    @jimshtepa5423 7 days ago

    10:55 what's wrong with uzbekistan?))))

  • @krishnapraveen777
    @krishnapraveen777 7 days ago

    Chad engineer

  • @hemantishwaran5741
    @hemantishwaran5741 7 days ago

    It’s great for ggplot and webpages. But if you ever write a textbook go straight to latex from the command line.

  • @malware_creations2606
    @malware_creations2606 8 days ago

    Also, I've read that Kafka has an issue with consumer lag. How do you handle that?

  • @zuowang5185
    @zuowang5185 17 days ago

    Is there an updated version of the logging pipeline 4 years later?

  • @bluejinux
    @bluejinux 20 days ago

    One of the best presentations on what purpose of data warehouse and data lakehouse and where the future is going for data.

  • @randomhandle307
    @randomhandle307 23 days ago

    Very nice. Thanks

  • @AndreaMontes_
    @AndreaMontes_ 26 days ago

    I'm rewatching this talk, the speaker is quite good. Taking some notes to prepare my own talk

  • @hannahnelson4569
    @hannahnelson4569 1 month ago

    Very cool talk! The idea of learning heuristics was very cool! I didn't quite understand how the criterion for splitting down multiple paths works! I will check out the source code! Thank you for hosting this talk!

  • @fb-gu2er
    @fb-gu2er 1 month ago

    Backend in Python? Yikes

  • @guykerem7874
    @guykerem7874 1 month ago

    One of the best talks on data in 2024. Thank you Abhi! You never miss a chance to inspire and impress

  • @tessafelice2181
    @tessafelice2181 1 month ago

    I love the name mother duck. I feel it’s a respectful tribute to the female source of life and code.

  • @CreativeInspireP380
    @CreativeInspireP380 1 month ago

    This was an extremely informative talk - especially the section on challenges - and one I wish would receive more attention due to how useful it is as an overview to quite a few complex and highly relevant issues. It would be nice if it were re-elaborated and presented in a non-live presentation format.

  • @the-ghost-in-the-machine1108
    @the-ghost-in-the-machine1108 1 month ago

    thanks

  • @nosh3019
    @nosh3019 1 month ago

    Great talk 🎉

  • @jayleejw1801
    @jayleejw1801 1 month ago

    The amount of background noise in this video is absurd.

  • @tratkotratkov126
    @tratkotratkov126 1 month ago

    Great, very much needed and promising project! However, it is not quite clear what you mean when you talk about data versioning (DV): do you version the data as LakeFS does, or are you just versioning the source code which produces this data? Also, I find the diagrams in the presentation (Virtual/Physical layers) confusing and not easy to grasp at first glance. It would be nice if, in the next iteration, you used some real-world/practical entities to describe demo objects, like customer, product, sales etc., instead of just “source”, and wrapped the demo in some quick story like “Meet Alex, the data engineer at TechCorp, a rapidly growing tech company. Alex is responsible for managing the company’s data pipelines, ensuring that data from various sources is clean, consistent, and available for analysis” etc. You get the idea. Finally, I would suggest you switch the sequence and the time you spend on the theory and the demo part: show your fantastic open source project demo first and how easy it is to implement the 3 concepts in a meaningful story, then after each segment just mention the theoretical part, but don’t allow the theory to consume 75% of your presentation unless you want to be considered one of the many Data Governance “gurus” who present on this channel. Wishing you all good luck with this fantastic project!

  • @LucasCardoso-mw4ok
    @LucasCardoso-mw4ok 1 month ago

    Hi! Nice video. I'm a little concerned about how I can get my development data from Copilot.

  • @KC53557
    @KC53557 1 month ago

    A good example of not getting AI right is the creation of the Maga loon and Jan 6.

  • @68sahil56
    @68sahil56 1 month ago

    30:29

  • @68sahil56
    @68sahil56 1 month ago

    18:19

  • @VipulVaibhaw
    @VipulVaibhaw 1 month ago

    Fantastic talk!

  • @allthingsdata
    @allthingsdata 1 month ago

    Loved it.

  • @AshishKumar-ll2mt
    @AshishKumar-ll2mt 1 month ago

    Looks like this field never took off the way it should have

  • @yogeshbharadwaj6200
    @yogeshbharadwaj6200 1 month ago

    Very nice demo..Tks..

  • @compilation_exe3821
    @compilation_exe3821 1 month ago

    Amazing

  • @timothymcglynn1935
    @timothymcglynn1935 2 months ago

    HI 👋

  • @HikarusVibrator
    @HikarusVibrator 2 months ago

    If someone can explain to me how you’re supposed to do a major version DB upgrade with a Debezium connector. It’s such an unbelievable pain that it’s a total dealbreaker. Unless I’m missing something

  • @Eriddoch
    @Eriddoch 2 months ago

    Dang, Miriah you are an AMAZING speaker, and as someone who works on data engineering systems but doesn't own them (MLOps), this is really valuable.

  • @420_gunna
    @420_gunna 2 months ago

    bullshit buzzwords "cognitive analytics" vomit and a saccharine exhortative tone "quantum computing + graphene + ai" come on

  • @paoloogr
    @paoloogr 2 months ago

    Nice talk! Thanks.

  • @ex-cursion
    @ex-cursion 2 months ago

    I loved this and wish there was more of it. Thank you! But as noted: 'invoice reconciliation is boring'. I feel like the survival of our species will pivot not on our curiosity, but on our capacity to constrain our desire for novelty enough to solve boring problems.

  • @matthewborn
    @matthewborn 2 months ago

    This is an excellent talk. Thank you, Abhi!

  • @malcolmgdavis
    @malcolmgdavis 2 months ago

    Pointer vs. Value discussion: Based on the Method vs. Function discussion, ADT should be strictly adhered to. Operations that modify the ADT are modeled as functions that take the old state as an argument and return the new state as part of the result. In other words, a function should enforce immutability. The ADT approach helps with concurrency, making the code cleaner and easier to read. As an API user, I shouldn't worry about the state changing when I pass a structure. Of course, the pure ADT model's problem is memory consumption. That's why ADT models are generally implemented in VMs that can routinely find old structures without references and remove them from memory.

  • @malcolmgdavis
    @malcolmgdavis 2 months ago

    The method vs. function debate is absurd. The presenter needs to learn or spend time with OO programming. Class methods don't have to be logically connected to states. I developed in C during the 80s. The problem with structs is that the data is the point of coupling. The class hides data. In OO, the focus is on behavior and not the state. The OO state can be anywhere and can change. The strategy allows the implementation of the module to be changed without disturbing the client programs.

  • @1988YUVAL
    @1988YUVAL 2 months ago

    Very interesting presentation. Looks like a very well thought out solution for managing data transformations. I wonder if it will take off like dbt.

  • @Jack-lg9mq
    @Jack-lg9mq 2 months ago

    Good presentation. Also nice to see that Jimmi Simpson is expanding his horizons.

  • @mattbahr228
    @mattbahr228 2 months ago

    Awesome presentation!

  • @wonlee4138
    @wonlee4138 2 months ago

    Thanks for the great presentation!

  • @prashant776
    @prashant776 2 months ago

    Really good and informative. I congratulate PeerDB on their recently secured seed round. I see a lot of potential in PeerDB where organisations are looking to stream their data to a warehouse. I have had a very unique need, and I wish PeerDB had been a choice back then.

  • @AndreaMontes_
    @AndreaMontes_ 2 months ago

    Great speaker 👏👏

  • @thrawn01
    @thrawn01 2 months ago

    This was super useful, I learned a lot, Thank you!

  • @IbraheemFaiq
    @IbraheemFaiq 2 months ago

    Great

  • @samhughes1747
    @samhughes1747 2 months ago

    I really enjoyed this. It was high-level, but hey, a hype-free, facts-only talk about working with generative models? I'll take it!

  • @Shikara_Animals
    @Shikara_Animals 2 months ago

    Best teacher ❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤

  • @VijayasarathyMuthu
    @VijayasarathyMuthu 2 months ago

    You should include LightDash

  • @whatSriBishnusRajDharmaN-ek1hl
    @whatSriBishnusRajDharmaN-ek1hl 2 months ago

    mother chods what doing here canot learn me detect leran mine concern your life risk at usa houston