Developer at Atomist
Jessica Kerr is a developer and philosopher of software. At Atomist, she works on development and delivery automation: she writes code to let us write code to help us update and deliver code. At software conferences, she speaks about languages (Java, Scala, Clojure, Ruby, Elm, now TypeScript), paradigms (functional programming, DevOps), and now symmathesy. Her interests include resilience engineering, graceful systems, and Silly Things (which her daughters find on the internet). Find her work at blog.atomist.com, the podcast Greater than Code, and on twitter at her true name: @jessitron.
Collective Problem Solving in Music, Science, Software
There's a story to tell, about musicians, artists, philosophers, scientists, and then programmers.
There's a truth inside it that leads to a new view of work, that sees beauty in the painful complexity that is software development.
Starting from _The Journal of the History of Ideas_, Jessica traces the concept of an "invisible college" through music and art and science to programming. She finds the dark truth behind the 10x developer, a real definition of "Senior Developer," and a new name for our work and our teams.
Software Engineer at Softwaremill
I’m a passionate software engineer living mainly, but not exclusively, in JVM land. I also tend to play with electronics and hardware. When sharing my knowledge, I always keep in mind that a working example is worth a thousand words.
How (Not) to Use Reactive Streams in Java 9+
Did you try to implement one of the new java.util.concurrent.Flow.* interfaces yourself? Then you’re most probably doing it wrong.
The purpose of this talk is to show that implementing them yourself is far from trivial and to discuss the actual reasons why they have been included in the JDK.
Reactive Streams is a standard for asynchronous data processing in a streaming fashion with non-blocking backpressure. Starting from Java 9, it has become a part of the JDK in the form of the java.util.concurrent.Flow interfaces.
Having the interfaces at hand may tempt you to write your own implementations. Surprising as it may seem, that’s not what they are in the JDK for.
In this session, we’re going to go through the basic concepts of reactive stream processing and see how (not) to use the APIs included in JDK 9+. Plus we’re going to ponder the possible directions in which JDK’s Reactive Streams support may go in the future.
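The backpressure idea mentioned above can be illustrated with a small sketch. This is a Python analogue, not the java.util.concurrent.Flow API: the class and function names below are hypothetical, loosely mirroring the publisher, subscriber, and subscription roles, and the sketch is synchronous and single-threaded, so it ignores exactly the concurrency rules that make real implementations hard.

```python
class RangeSubscription:
    """Hands out items only when the subscriber has signalled demand."""

    def __init__(self, items, subscriber):
        self._items = iter(items)
        self._subscriber = subscriber
        self._done = False

    def request(self, n):
        # Emit at most n items: the subscriber, not the source, sets the pace.
        for _ in range(n):
            if self._done:
                return
            try:
                item = next(self._items)
            except StopIteration:
                self._done = True
                self._subscriber.on_complete()
                return
            self._subscriber.on_next(item)

    def cancel(self):
        self._done = True


class BatchingSubscriber:
    """Collects items and asks for more in fixed-size batches."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.received = []
        self.completed = False
        self._subscription = None

    def on_subscribe(self, subscription):
        self._subscription = subscription

    def on_next(self, item):
        self.received.append(item)

    def on_complete(self):
        self.completed = True

    def pull(self):
        # Signal demand for one more batch.
        self._subscription.request(self.batch_size)


def subscribe(items, subscriber):
    """Hypothetical helper playing the publisher role."""
    subscriber.on_subscribe(RangeSubscription(items, subscriber))


if __name__ == "__main__":
    sub = BatchingSubscriber(batch_size=2)
    subscribe(range(5), sub)
    sub.pull()
    print(sub.received)
```

Nothing is delivered until the subscriber calls `pull()`, which is the essence of demand-driven flow control. A production-grade implementation additionally has to handle concurrent signals, re-entrant requests, and error propagation.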
Scientist at Microsoft Research, Cambridge
Sara-Jane Dunn is a Scientist at Microsoft Research, Cambridge. She studied Mathematics at the University of Oxford, graduating with a MMath in 2007. She remained in Oxford for her doctoral research, as part of the Computational Biology group at the Department of Computer Science. In 2012, she joined Microsoft Research as a postdoctoral researcher, before transitioning to a permanent Scientist role in 2014. In 2016, she was invited to become an Affiliate Researcher of the Wellcome Trust-Medical Research Council Stem Cell Institute, University of Cambridge. Her research focuses on uncovering the fundamental principles of biological information-processing, particularly investigating decision-making in Development.
The 20th Century was transformed by the ability to program on silicon, an innovation that made possible technologies that fundamentally revolutionised how the world works. As we face global challenges in health, food production, and in powering an increasingly energy-greedy planet, it is becoming clear that the 21st Century could be equally transformed by programming an entirely different material: biological matter. The power to program biology could transform medicine, agriculture, and energy, but relies, fundamentally, on an understanding of biochemistry as molecular machinery in the service of biological information-processing. Unlike engineered systems, however, living cells self-generate, self-organise, and self-repair, they undertake massively parallel operations with slow and noisy components in a noisy environment, they sense and actuate at molecular scales, and most intriguingly, they blur the line between software and hardware. Understanding this biological computation presents a huge challenge to the scientific community. Yet the ultimate destination and prize at the culmination of this scientific journey is the promise of revolutionary and transformative technology: the rational design and implementation of biological function, or more succinctly, the ability to program life.
Founder at Mesoica
Mark is founder of Mesoica, a data management firm working for the financial industry. Mark has 15 years of experience working in asset and wealth management firms and has focussed on how to make systems communicate better.
Crossing the bridge - how do we link end-user-computing and formal tech for data savvy teams
With Excel or custom tooling (Python, R, etc) there's flexibility to build data processing and preparation pipelines. Getting these to production level is often a different story as traditional or formal IT organisations are not well equipped to handle this kind of development.
In this talk, I'll show how we have combined SQL and NoSQL storage engines to create flexible and production ready data pipelines that can deal with unstructured data flows in an efficient manner.
Distinguished Architect at Verizon
Jon is a distinguished architect at Verizon, and the architect of and one of the main contributors to Vespa, the open big data serving engine. Jon has 20 years of experience as an architect and programmer on large distributed systems. He has a master's degree in computer science from the Norwegian University of Science and Technology.
Big data serving: The last frontier. Processing and inference at scale in real-time
The big data world has mature technologies for offline analysis and learning from data, but has lacked options for making decisions in real time. This talk introduces vespa.ai - a mature platform for processing data and making inferences at large scale at end-user request time.
Offline and stream processing of big data sets can be done with tools such as Hadoop, Spark, and Storm, but what if you need to process big data at the time a user is making a request?
This talk introduces Vespa, the open source big data serving engine, which targets the serving use cases of big data by providing response times in the tens of milliseconds at high request rates. Vespa allows you to search, organize and evaluate machine-learned models from e.g. TensorFlow over large, evolving data sets. Among the applications powered by Vespa are the scoring/serving of ads in the world's third-largest ad exchange (Oath) and the online content selection at Yahoo, handling billions of daily queries over billions of documents. Vespa was recently open sourced at https://vespa.ai.
Developer Advocate at Oracle
Oleg Šelajev is a developer advocate at Oracle Labs working on GraalVM - the high-performance embeddable polyglot virtual machine. He organizes VirtualJUG, the online Java User Group, and a GDG chapter in Tartu, Estonia. In 2017 he became a Java Champion.
GraalVM: Run Programs Faster Everywhere
Freelancer at Interdiscount
Born and raised in beautiful Sardinia, Italy, I moved to Zurich to complete my studies at the ETH (Swiss Federal Institute of Technology).
After working on delay tolerant networks with Android devices I focused on Web development and scalable and resilient software architectures on the cloud.
Currently working as a Freelancer at Interdiscount, the market leader for electronics in Switzerland.
Serverless Continuous Delivery of Microservices on Kubernetes with Jenkins X
Jenkins X, the innovative K8s-native CI/CD project, is moving extremely fast. It has recently embraced the Knative project and Prow for K8s in order to build and deploy polyglot apps using serverless jobs. This new approach might be the future of CD in the cloud, thanks to better performance and lower costs.
In the last few years, we witnessed big changes in how we actually build, deploy and run applications with the rise of Microservices, Containers, Kubernetes and Serverless frameworks. Those amazing improvements need a cultural shift based on continuous improvement in order to deliver business value and delight our customers.
But how could a team achieve this ambitious goal?
This talk will introduce the attendees to a revolutionary open source project, called Jenkins X Serverless, which attempts to achieve this goal. It is a reimagined CI/CD Ecosystem for Kubernetes built around Jenkins X Serverless, which leverages Prow and Knative serverless functions.
After this talk, attendees will be able to develop effectively in a cloud native way in any language on any Kubernetes cluster!
Let’s finally be Agile!
PhD student at University of California, Riverside
Zach Zimmerman is a 4th year PhD student at University of California, Riverside. His research is focused on scalable time series data mining, in particular, using GPUs, distributed computing, and machine learning to enable scaling of time series motif discovery. His work is being used across multiple domains by researchers in both academia and industry. He has industry experience through internships with Google, Intel, and Nvidia working on various projects, mostly in the high-performance computing space.
From Billions to Quintillions: Paving the way to real-time motif discovery in time series
The matrix profile is a tool which encodes the distance of each subsequence in a time series to its nearest neighbor. This “all pairs nearest neighbor” information is very useful for finding motifs and anomalies in just about any time series data and is being used by many in both industry and academia.
In this talk, I will explain the path of optimizations we took and the lessons we learned in developing a scalable solution for this “all pairs” problem in time series as well as introduce our current work in establishing a real-time, streaming approximation.
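To make the definition concrete, here is a minimal brute-force sketch of a matrix profile. It is not the optimized approach the talk describes: it runs in quadratic time, and the z-normalized Euclidean distance and the exclusion zone of half the window length are assumptions chosen for illustration, as are all function names.

```python
import math


def znorm(window):
    """Z-normalize a window so that shape matters, not offset or scale."""
    mean = sum(window) / len(window)
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / len(window))
    if std == 0:
        return [0.0] * len(window)
    return [(x - mean) / std for x in window]


def matrix_profile(series, m):
    """Return (profile, index): nearest-neighbor distance and position
    for every length-m subsequence of `series`."""
    n = len(series) - m + 1
    subs = [znorm(series[i:i + m]) for i in range(n)]
    exclusion = max(1, m // 2)  # assumed: skip trivially overlapping matches
    profile = [math.inf] * n
    index = [-1] * n
    for i in range(n):
        for j in range(n):
            if abs(i - j) < exclusion:
                continue
            d = math.dist(subs[i], subs[j])
            if d < profile[i]:
                profile[i], index[i] = d, j
    return profile, index


if __name__ == "__main__":
    pattern = [0, 1, 3, 1, 0]
    series = pattern + [9, 2, 7, 4] + pattern
    profile, index = matrix_profile(series, m=5)
    # Low values mark motifs (repeated shapes); high values mark anomalies.
    motif_start = min(range(len(profile)), key=profile.__getitem__)
    print(motif_start, index[motif_start])
```

In this toy series the repeated pattern at the start and the end forms a motif, so those two positions point at each other with distance zero.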
Software Engineer at Red Hat
Lili Cosic is a Software Engineer at Red Hat, working on the operator-framework, enabling the community to make any application Kubernetes native. Previously she worked at Weaveworks on the Weave Cloud integration with Kubernetes and, before that, she found her passion for Kubernetes operators at Kinvolk, helping develop the Habitat Operator. In her free time, Lili enjoys experimenting with Kubernetes and distributed systems, as well as writing operators for fun and not profit. She dislikes writing about herself in the third person.
An intro to Kubernetes operators
An Operator is an application that encodes the domain knowledge of the application and extends the Kubernetes API through custom resources. They enable users to create, configure, and manage their applications. Operators have been around for a while now, and that has allowed for patterns and best practices to be developed.
In this talk, Lili will explain what operators are in the context of Kubernetes and present the different tools out there to create and maintain operators over time. She will end by demoing the building of an operator from scratch, and also using the helper tools available out there.
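Operators are commonly structured around a reconcile loop that compares the desired state declared in a custom resource with the state actually observed, and then acts on the difference. The sketch below illustrates only that idea with in-memory Python objects; it does not use any Kubernetes client, and `CustomResource`, `Cluster`, and `reconcile` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class CustomResource:
    """Desired state, as a user would declare it."""
    name: str
    replicas: int


@dataclass
class Cluster:
    """Observed state: resource name -> names of running pods."""
    pods: dict = field(default_factory=dict)


def reconcile(resource, cluster):
    """Move the observed state toward the desired state.

    Calling it again with nothing changed is a no-op (idempotent),
    which is what lets a controller re-run it on every event.
    """
    pods = cluster.pods.setdefault(resource.name, [])
    while len(pods) < resource.replicas:
        pods.append(f"{resource.name}-{len(pods)}")  # scale up
    while len(pods) > resource.replicas:
        pods.pop()                                   # scale down
    return pods


if __name__ == "__main__":
    cluster = Cluster()
    db = CustomResource(name="db", replicas=3)
    print(reconcile(db, cluster))
    db.replicas = 1
    print(reconcile(db, cluster))
```

A real operator would encode far more domain knowledge in this step, such as backups, upgrades, or failover, but the compare-and-converge shape stays the same.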
Software Engineer at Line+
Trustin Lee is a software engineer who is often known as the founder of the Netty project, the most popular asynchronous networking framework in the JVM ecosystem. He enjoys designing frameworks and libraries which yield the best experience to developers. At LINE+ Corporation, the company behind ‘LINE’, the top mobile messenger in Japan, Taiwan and Thailand, he builds various open-source software, such as the microservice framework Armeria and the distributed configuration repository Central Dogma, to facilitate the adoption of microservice architecture.
Armeria: The Only Thrift/gRPC/REST Microservice Framework You'll Need
The founder of Netty introduces a new microservice framework, ‘Armeria’. It is unique because it 1) has a Netty-based high-performance HTTP/2 implementation, 2) lets you run gRPC, Thrift, REST, and even Servlet webapps on a single TCP port in a single JVM, and 3) integrates with Spring WebFlux and Reactive Streams.
Armeria is a Netty-based open-source Java microservice framework which provides an HTTP/2 client and server implementation. It differs from other RPC frameworks in that it supports both gRPC and Thrift. It also supports RESTful services based on the Reactive Streams API and even legacy web applications that run on Tomcat or Jetty, allowing you to mix and match different technologies in a single service. This means you do not need to launch multiple JVMs or open multiple TCP/IP ports just because you have to support multiple protocols or migrate from one to another.
In this session, Trustin Lee, the founder of Netty project and Armeria, shows:
Quantitative engineer/data scientist at Facebook
Ghida works as a quantitative engineer/data scientist in the edge infrastructure team at Facebook London, where she builds data-driven tools and models, and performs in-depth analysis to drive the expansion and optimize the operation of one of the largest and most complex networks forming the internet. Many of the projects that she has led aim at leveraging Facebook data insights to help build a more inclusive internet, by increasing internet penetration and the quality of experience of people online worldwide.
Leveraging AI for facilitating refugee integration
Today's world counts more than 20 million refugees, including over 1 million refugees resettled in Europe as a result of the ongoing Syrian civil war. When fleeing conflicts, refugees risk their lives with the hope of building a better future for themselves. However, upon resettling in a new country, refugees struggle to easily find the opportunities available to them and to filter the ones that are the most relevant to their profile and current context.
In this talk, we explain how AI can be leveraged to connect refugees in real time and in a customized way to the opportunities that will most accelerate their integration, bringing them a step closer to the better future they strive to build for themselves.
Architect at Oracle
Ewan started out as a research scientist and then drifted into IT. These days he is an architect in Oracle’s EMEA Technology Cloud Team, has over twenty years of experience in the technology industry and a lot less hair. He joined Oracle when they acquired Thor Technology in 2005. He intended to stay for six months and he's still there.
He is currently focused on helping Oracle’s customers and partners adopt a cloud-native approach to development.
Outside of work, Ewan is an active member of the Norwich Ruby User Group (NRUG) and Digital East Anglia. He contributes to a number of open source projects and is one of the organisers of the DevEast conference in the UK.
Free the Functions with Fn project!
“Serverless” is the hottest ticket in town right now.
But many serverless platforms restrict your choice of language and / or dictate where your code runs.
In this talk, I’ll describe how we can go to the serverless ball with open source and the Fn project in particular.
“Serverless” aims to improve developer productivity by abstracting away the underlying infrastructure layers. The servers are still there, but you just can’t see them.
This abstraction allows the developer to focus solely on the functions that deliver value to the business and not on the plumbing.
The economics of serverless are also interesting since you only consume resources when your functions run, rather than having applications running continually waiting to serve requests.
Sadly, some leading serverless platforms are not open and restrict choice in terms of language and deployment.
In this talk, I want to show how you can do serverless development with your choice of language, and deployment location.
VP of Engineering at The Workshop
Software Engineer and VP of Engineering at The Workshop. He holds a degree in physics and is passionate about data. He’s been living and working in Malaga for nearly 15 years.
Continuous Delivery in the Real World
More and more IT organizations are taking the step to Continuous Delivery instead of thinking in sprints or releases. It’s an important investment, and it opens a world of tangible opportunities. In this talk, we’ll see how the ability to deploy individual features influences the way we work, design applications, and perform as an organisation.
Senior IoT Evangelist at InfluxData
David Simmons is the Senior IoT Developer Evangelist at InfluxData, helping developers around the globe manage the streams of data that their devices produce. He’s been passionate about IoT for nearly 15 years and helped to develop the very first IoT Developer Platform before “IoT” was even ‘a thing.’ He’s always had a thing about pushing the edge and seeing what happens. David has held numerous technical evangelist roles at companies such as DragonflyIoT, Riverbed Technologies, and Sun Microsystems.
Pushing it to the edge in IoT
Where is the edge in IoT and how much can you do there? Data collection? Analytics? I’ll show you how to build and deploy an embedded IoT edge platform that can do data collection, analytics, dashboarding and much more. All using Open Source.
As IoT deployments move forward, the need to collect, analyze, and respond to data further out on the edge becomes a critical factor in the success – or failure – of any IoT project. Network bandwidth costs may be dropping, and storage is cheaper than ever, but at IoT scale, these costs can still quickly overrun a project’s budget and ultimately doom it to failure.
The more you centralize your data collection and storage, the higher these costs become. Edge data collection and analysis can dramatically lower these costs, plus decrease the time to react to critical sensor data. With most data platforms, it simply isn’t practical, or even possible, to push collection AND analytics to the edge. In this talk I’ll show how I’ve done exactly this with a combination of open source hardware – Pine64 – and open source software – InfluxDB – to build a practical, efficient and scalable data collection and analysis gateway device for IoT deployments. The edge is where the data is, so the edge is where the data collection and analytics needs to be.
Software Engineer at Red Hat
Alex is a Software Engineer at Red Hat in the Developers group. He is passionate about the Java world and software automation, and he believes in the open source software model.
Alex is the creator of the NoSQLUnit project, a member of the JSR374 (Java API for JSON Processing) Expert Group, the co-author of the book Testing Java Microservices for Manning, and a contributor to several open source projects. A Java Champion since 2017 and an international speaker, he has talked about new testing techniques for microservices and continuous delivery in the 21st century.
Istio Service Mesh & pragmatic microservices architecture
We have been celebrating 2018 as the Year of the Service Mesh, where an open source effort known as Istio has taken and changed how we design and release our applications.
As we move toward cloud-native infrastructure and build our applications out of microservices, we must fully face the drawbacks and challenges of doing so. Some of these challenges include how to consistently monitor and collect statistics, tracing, and other telemetry, how to add resiliency in the face of unexpected failure, how to do powerful feature routing, and much more.
Istio and service mesh in general help developers solve this in a non-invasive way.
In this session, we’ll show how you can take advantage of these capabilities in an incremental way. We expect most developers haven’t adequately solved these issues, so we’ll take it step by step and build up a strong understanding of Istio, how to get quick wins, and how to harness its power in your production services architecture.
Mainly a Java Software Engineer / Consultant focused on distributed systems development, adopting the Reactive Manifesto and Reactive Programming techniques. Open Source geek and active contributor to Project Reactor / RSocket. Along with that, a public speaker and author of the book "Reactive Programming in Spring 5.0".
RSocket - Future Reactive Application Protocol
Are you doing microservices? Exhausted by slow REST? Mad at unreliable gRPC? An answer is RSocket. RSocket is a new network protocol with Reactive Streams semantics. It will make your system super fast and resilient. Come and learn why RSocket is the future of any cross-service communication.
The new generation of cross-service communication is coming, and it is called RSocket. RSocket is a new protocol that brings Reactive Streams semantics to cross-service messaging.
This protocol enables backpressure control and allows building canonical reactive systems. Although the protocol offers asynchronous message streaming, it already has a few competitors in this area. One of them is the well-known gRPC. In this session, we are going to learn why RSocket is an innovative solution for cross-service communication, whether it can be compared with gRPC at all and, if so, what the key differences between RSocket and gRPC are, and why we should start using RSocket today.
Consultant at INNOQ
Lars is a consultant with INNOQ in Munich, Germany. He has been using Scala for quite a while now and is known as one of the founders of the Typelevel initiative which is dedicated to providing principled, type-driven Scala libraries in a friendly, welcoming environment. He is known to be a frequent conference speaker and active in the open source community, particularly in Scala. He also enjoys programming in and talking about Haskell, Prolog, and Rust.
Numeric Programming with Spire
Spire is a Scala library for fast, generic, and precise numerics. It allows us to write generic numeric algorithms, provides the ‘number tower’ and offers a lot of utilities you didn’t know you needed.
Numeric programming is a notoriously difficult topic. For number crunching, e.g. solving systems of linear equations, we need raw performance. However, using floating-point numbers may lead to inaccurate results. On top of that, in functional programming, we’d really like to abstract over concrete number types, which is where abstract algebra comes into play. This interplay between abstract and concrete and the fact that everything needs to run on finite hardware is what makes good library support necessary for writing fast & correct programs. Spire is such a library in the Typelevel Scala ecosystem. This talk will be an introduction to Spire, showcasing the ‘number tower’, real-ish numbers and how to obey the law.
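The tension between floating-point speed and exact results, and the appeal of abstracting over number types, can be shown in a few lines. The sketch below is a Python analogue using the standard `fractions` module; it is not Spire's API, and `total` is a hypothetical helper.

```python
from fractions import Fraction


def total(values, zero):
    """Generic sum: works for any type that supports `+`.

    `zero` is the additive identity of that type, passed explicitly
    because the function makes no assumption about the number type.
    """
    acc = zero
    for v in values:
        acc = acc + v
    return acc


if __name__ == "__main__":
    floats = [0.1] * 10
    exact = [Fraction(1, 10)] * 10

    # Binary floating point cannot represent 0.1 exactly, so error accumulates.
    print(total(floats, 0.0) == 1.0)

    # Rational arithmetic stays exact, at the cost of speed.
    print(total(exact, Fraction(0)) == 1)
```

The same algorithm runs unchanged over both number types; only the trade-off between performance and precision differs.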
Principal Engineer at Lightbend
Helena did her academic work in scientific research before getting into software engineering. Formerly at Apple working on platform infrastructure for distributed data/analytics/ml (aaS) at massive scale, VP of Product Engineering at Tuplejump building a multi-tenant stream analysis machine learning platform, Senior Cloud Engineer at CrowdStrike working on cloud-based realtime Cyber Security threat analysis, and Senior Cloud Engineer at VMware automating cloud infrastructure for massive scale. She is a keynote speaker, has given conference talks at Kafka Summit, Spark Summit, Strata, Reactive Summit, QCon SF, Scala Days, and Philly Emerging Tech, and is a contributor to several open source projects like Akka and FiloDB. She is currently a Principal Engineer at Lightbend.
Toward Predictability and Stability At The Edge Of Chaos
As we edge towards larger, more complex and decoupled systems, combined with the continual growth of the global information graph, our frontiers of unsolved challenges grow equally fast. Central challenges for distributed systems include persistence strategies across DCs, zones or regions, network partitions, data optimization, and system stability in all phases.
How does leveraging CRDTs and Event Sourcing address several core distributed systems challenges? What are useful strategies and patterns involved in the design, deployment, and running of stateful and stateless applications for the cloud, for example with Kubernetes? Combined with code samples, we will see how Akka Cluster, Multi-DC Persistence, Split Brain, Sharding and Distributed Data can help solve these problems.
Research Data Scientist at Fraunhofer IEE
Nicolas Kuhaupt is working on projects in the field of Big Data and Artificial Intelligence with the goal of advancing the digitalization of the Energy Transition. In this endeavor, he has observed the rise of Deep Reinforcement Learning (DRL) and has already implemented DRL algorithms. He is thrilled by the possibilities DRL has to offer and is looking forward to spreading his passion for it. Additionally, he loves to join data conferences and is looking forward to meeting interesting people at JOTB and talking about data.
Getting started with Deep Reinforcement Learning
Reinforcement Learning is a hot topic in Artificial Intelligence (AI) at the moment with the most prominent example of AlphaGo Zero. It shifted the boundaries of what was believed to be possible with AI. In this talk, we will have a look into Reinforcement Learning and its implementation.
Reinforcement Learning is a class of algorithms which trains an agent to act optimally in an environment. The most prominent example is AlphaGo Zero, where the agent is trained to place tokens on the board of Go in order to win the game. AlphaGo Zero has won against the world champion, which was thought to be impossible at that time. This was enabled by combining Reinforcement Learning with Deep Neural Networks and is today known as Deep Reinforcement Learning. This has shifted the frontier of Artificial Intelligence and enabled multiple complex use cases, among them controlling the cooling devices in Google's server rooms. Applying Deep Reinforcement Learning saved Google several million in power costs. In this talk, we will understand the basics of Deep Reinforcement Learning and implement a simple example. We will have a look at OpenAI's Gym, which is the de facto standard for Reinforcement Learning environments. This will enable the audience to implement both an environment and a Reinforcement Learning agent on their own.
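The agent-and-environment loop can be sketched without any library. The example below uses tabular Q-learning rather than a deep network, and a hypothetical `Corridor` environment whose `reset`/`step` methods only imitate the style of Gym environments; it does not import Gym, and all hyperparameters are illustrative assumptions.

```python
import random


class Corridor:
    """Agent starts in cell 0 and is rewarded for reaching the last cell."""

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # action 0 moves left, action 1 moves right; walls clamp the position.
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.length - 1)
        done = self.state == self.length - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def train(env, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(env.length)]
    for _ in range(episodes):
        state = env.reset()
        done = False
        while not done:
            # Explore with probability epsilon, or when the estimates are tied.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = env.step(action)
            # Q-learning update: bootstrap from the best next action.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


if __name__ == "__main__":
    q_table = train(Corridor())
    policy = ["right" if right > left else "left" for left, right in q_table[:-1]]
    print(policy)
```

Deep Reinforcement Learning replaces the table `q` with a neural network that estimates the same values, which is what makes large state spaces such as Go tractable.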
Researcher at University of California, Riverside
Philip Brisk received the B.S., M.S., and Ph.D. Degrees, all in Computer Science, from the University of California, Los Angeles (UCLA) in 2002, 2003, and 2006 respectively. From 2006-2009 he was a postdoctoral researcher at EPFL in Switzerland. Since 2009, he has been with the Department of Computer Science and Engineering at the University of California, Riverside. His research interests include the application of computer engineering principles to biological instrumentation, FPGAs and reconfigurable computing, and efficient implementation of computer systems. He is a Senior Member of the ACM and the IEEE.
Acoustic Time Series in Industry 4.0: Improved Reliability and Cyber-Security Vulnerabilities
Industry 4.0, aka the "Fourth Industrial Revolution," refers to the computerization of manufacturing. One important aspect of Industry 4.0 is the ability to monitor the health and reliability of a physical manufacturing plant using low-cost IoT sensors. For example, machine learning models can be trained to predict the physical degradation of a manufacturing system as a function of acoustic measurements obtained from strategically placed microphones; however, the same acoustic measurements can be used to reverse engineer proprietary information about the manufacturing process and/or precisely what is being manufactured at the time of recording. Thus, improved reliability and fault tolerance is achieved at the cost of what appears to be an unprecedented new class of security vulnerabilities related to the acoustic side channel.
As a case study, we report a novel acoustic side channel attack against a commercial DNA synthesizer, a commonly used instrument in fields such as synthetic biology. Using a smart phone-quality microphone placed on or in the near vicinity of a DNA synthesizer, we were able to determine with 88.07% accuracy the sequence of DNA being produced; using a database of biologically relevant known-sequences, we increased the accuracy of our model to 100%. An academic or industrial research project may use the synthetic DNA to engineer an organism with desired traits or functions; however, while the organism is still under development, prior to publication, patent, and/or copyright, the research remains vulnerable to academic intellectual property theft and/or industrial espionage. On the other hand, this attack could also be used for benevolent purposes, for example, to determine whether a suspected criminal or terrorist is engineering a harmful pathogen. Thus, it is essential to recognize both the benefits and risks inherent to the cyber-physical systems that will inevitably control Industry 4.0 manufacturing processes and to take steps to mitigate them whenever possible.
CEO at Ravenpack
Armando Gonzalez is President & CEO of RavenPack, the leading provider of big data analytics for financial institutions. Armando is an expert in applied big data and artificial intelligence technologies. He has designed systems that turn unstructured content into structured data, primarily for financial trading applications. Armando is widely regarded as one of the most knowledgeable authorities on automated text and sentiment analysis.
His commentary and research have appeared in leading business publications such as the Wall Street Journal and the Financial Times, among many others. Armando holds degrees in Economics and International Business Administration from the American University in Paris and is a recognized speaker at academic and business conferences across the globe.
Big Data is the New Currency
Data is largely collected, anonymized and monetized without the owner’s permission, and this is about to be disrupted, providing many benefits to hedge fund data buyers. This presentation provides a pathway for the individual to control and share in the value their data creates, and for data users to gain access to richer, more specific data sets.
Software Developer at Streamlio
Ivan is a software developer for Streamlio, where he works on Apache Pulsar and Apache BookKeeper. He's been involved with BookKeeper since its early days at Yahoo Labs Barcelona and also worked on the predecessor systems to Pulsar at Yahoo. His expertise is in replicated logging, distributed systems, and networking, though often not at the same time.
Infinite Topic Backlogs with Apache Pulsar
The talk is about how Apache Pulsar can have topic backlogs of unlimited size, opening up a whole array of Big Data use-cases that are not possible with other messaging systems. We also delve into tiered storage, which can make these massive backlogs very cheap.
Messaging systems are an essential part of any real-time analytics engine. A common pattern is to feed a user event stream into a processing engine, show the result to the user, capture feedback from the user, push the feedback back into the event stream, and so on. The quality of the result shown to the user is often a function of the amount of data in the event stream, so the more your event stream scales, the better you can serve your users.
Messaging systems have recently started to push into the field of long-term data storage and event stores, where you cannot compromise on retention. If data is written to the system, it must stay there.
Infinite retention can be challenging for a messaging system. As data grows for a single topic, you need to start storing different parts of the backlog on different sets of machines without losing consistency.
In this talk, I will describe how Pulsar uses Apache BookKeeper in its segment-oriented architecture. BookKeeper provides a unit of consensus called a ledger. Pulsar strings together a number of BookKeeper ledgers to build the complete topic backlog. Each ledger in the topic backlog is independent of all previous ledgers with regard to location. This allows us to scale the size of the topic backlog simply by adding more machines. When a storage node is added to a Pulsar cluster, the brokers will detect it and gradually start writing new data to the new node. There’s no disruptive rebalancing operation necessary.
Of course, adding more machines will eventually get very expensive. This is where tiered storage comes in. With tiered storage, parts of the topic backlog can be moved to cheaper storage such as Amazon S3 or Google Cloud Storage. I will also discuss the architecture of tiered storage, and how it is a natural continuation of Pulsar’s segment oriented architecture.
Finally, if you start storing data for a long time in Pulsar, you may want a means to query it. I will introduce our SQL implementation, based on the Presto query engine, which allows users to easily query topic backlog data, without having to read the whole thing.
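The segment-oriented idea described above can be modelled in a few lines. This is a toy in-memory sketch, not Pulsar or BookKeeper code: `Ledger`, `TopicBacklog`, the round-robin placement policy, and the tier labels are all hypothetical, and the sketch assumes `replicas` never exceeds the number of nodes.

```python
from dataclasses import dataclass, field


@dataclass
class Ledger:
    """One segment of the backlog, stored on its own set of nodes."""
    ledger_id: int
    nodes: tuple
    entries: list = field(default_factory=list)
    tier: str = "bookie"  # becomes "object-store" once offloaded


class TopicBacklog:
    def __init__(self, nodes, replicas=2, max_entries=3):
        self.nodes = list(nodes)
        self.replicas = replicas
        self.max_entries = max_entries
        self.ledgers = []
        self._roll()

    def add_node(self, node):
        # No rebalancing: existing ledgers stay put, later ones may use the node.
        self.nodes.append(node)

    def _roll(self):
        """Close the current ledger and open a new one (assumed round-robin placement)."""
        ledger_id = len(self.ledgers)
        chosen = tuple(
            self.nodes[(ledger_id + k) % len(self.nodes)]
            for k in range(self.replicas)
        )
        self.ledgers.append(Ledger(ledger_id, chosen))

    def append(self, message):
        if len(self.ledgers[-1].entries) >= self.max_entries:
            self._roll()
        self.ledgers[-1].entries.append(message)

    def offload_older_than(self, keep_last):
        """Mark all but the newest `keep_last` ledgers as moved to cheap storage."""
        cutoff = max(0, len(self.ledgers) - keep_last)
        for ledger in self.ledgers[:cutoff]:
            ledger.tier = "object-store"

    def read_all(self):
        # The backlog is the concatenation of its ledgers, in order.
        return [entry for ledger in self.ledgers for entry in ledger.entries]


if __name__ == "__main__":
    topic = TopicBacklog(["a", "b"])
    for i in range(4):
        topic.append(i)
    topic.add_node("c")
    for i in range(4, 7):
        topic.append(i)
    topic.offload_older_than(keep_last=1)
    for ledger in topic.ledgers:
        print(ledger.ledger_id, ledger.nodes, ledger.tier, ledger.entries)
```

Because each ledger records its own location, growing the cluster or moving old segments to another tier never touches the data layout of the other segments.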
Creator of CQRS
Gregory Young coined the term “CQRS” (Command Query Responsibility Segregation) and it was instantly picked up by the community who have elaborated upon it ever since. Greg is an independent consultant and serial entrepreneur. He has 15+ years of varied experience in computer science from embedded operating systems to business systems and he brings a pragmatic and often times unusual viewpoint to discussions. He’s a frequent contributor to InfoQ, speaker/trainer at Skills Matter and also a well-known speaker at international conferences. Greg also writes about CQRS, DDD and other hot topics on codebetter.com.
The Bizarre Mating Ritual Of The Whipnose Seadevil
If you're an angler fish, you have it rough. You spend your life in the deep sea. It's lonely. Mates are hard to find. What do you do? If you're the male Whipnose Seadevil, you spend your life exclusively in search of that elusive, life long companion. You take this task so seriously that you forgo physical development and accept a stunted life -- that is until you fix yourself to a female, and release an enzyme that digests the skin of your mouth and her body, fusing you and your new-found love down to the blood-vessel level. And so you become dependent on her for survival, receiving nutrients via your newly formed shared circulatory system. In return, you provide valued sperm.
And therein lies the secret to building great software.
In this talk, Greg Young will make the case that polyandry and parasitic reproductive processes should serve as the model for programming. You'll learn how the Whipnose Seadevil adapts pragmatically to its deep sea environment and manages to accomplish what most of us as programmers only dream of: reduced metabolic costs in resource-poor environments and improved lifetime fitness relative to free-living competitors.
Don't miss this opportunity to learn from one of software's great visionaries!