3#8 - Alexandra Diem - Software Development: An Inspiration for Data Management? (Eng)

MetaDAMA - Data Management in the Nordics

This is DAMA Norway's podcast to create an arena for sharing experiences within Data Management, showcase competence and level of knowledge in this field in the Nordics, get in touch with professionals, spread the word about Data Management and not least promote the profession Data Management.
-----------------------------------
Dette er DAMA Norge sin podcast for å skape en arena for deling av erfaringer med Data Management, vise frem kompetanse og kunnskapsnivå innen fagfeltet i Norden, komme i kontakt med fagpersoner, spre ordet om Data Management og ikke minst fremme profesjonen Data Management.

January 08, 2024 • Alexandra Diem - Gjensidige • Season 3 • Episode 8

«The journey Software development went through during the last 10 years, working towards DevOps and agile development, is something that we can really benefit from in the data space.»

Uncover the synergy between agile software development and data management as we sit down with Alexandra Diem, head of Cloud Analytics and MLOps at Gjensidige, who bridges the gap between these two dynamic fields. In a narrative that takes you from the structured world of mathematics to the truly data-driven insurance sphere, Alexandra shares her insights on Cloud Analytics, Software Development, Machine Learning and much more. She illustrates how software methodologies can revolutionize data work.

This episode peels back the layers of MLOps, drawing parallels with the established tenets of software engineering. As we dissect the critical role of continuous development, automated testing, and orchestration in data product management, we also navigate the historical shifts in software project strategies that inform today's practices. Our conversation ventures into the realm of domain knowledge, product mindset, and federated governance, providing you with a well-rounded understanding of the complexities at play in modern data management.

Finally, we cast a pragmatic eye over the challenges and solutions within data engineering, advocating for a focus on practical effectiveness over the elusive pursuit of perfection. With Alexandra's expert perspective, we delve into the strategy of time-boxed approaches to data product development and the indispensable role of cross-functional teams. Join us for an episode that promises to enrich your view on the interplay between software and data.

Here are some key takeaways:

There is a certain push in the insurance industry towards data, AI and automation.
Gjensidige has over 20 decentralized analyst teams.
Data Mesh is about empowering analyst teams to take control over their data.
By taking responsibility for their own data, analyst teams take load off the data engineering teams, so they can focus on the tricky stuff.
MLOps, DataOps, or classic DevOps in the Data Space is about using System Development principles in the Data Space.
The questions that arise within data today are questions that software engineering went through 10 years ago.
Software development also went through a maturing process that brought forth a domain-driven focus, a best-practice focus, product thinking, etc.
Documentation should live where the code lives. It should be part of the code.
Introduce more software development best practices into the data teams.
Do not think about the solution you want to develop, but the problem you want to solve.
Time-box exploratory efforts into sprints.

The pitfalls

Software Development Lifecycle vs. Data Lifecycle – they overlap, but there are clear differences, especially in the late phases.
Feature-driven (or functionality-driven) vs. Data-driven: Is there a problem with software engineering mindset in data?
Hypothesis - Data Science vs. Engineering mindset: Explorational vs. structural thinking can cause friction
Environmental challenges: How does Test-Dev-Prod split fit with data?

Software Inspires Data Management in Development

Winfried Adalbert Etzel 0:00

This is MetaDAMA, a holistic view on data management in the Nordics. Welcome, my name is Winfried, and thanks for joining me for this episode of MetaDAMA. Our vision is to promote data management as a profession in the Nordics and show the competencies that we have, and that is the reason I invite Nordic experts in data and information management for a talk. Welcome to MetaDAMA. I am excited today because we are deep diving into a topic that is talked about a lot, discussed a lot, and I'm not really sure if it actually is a topic that is relevant for data management. But let's dive into how software development can be an inspiration for data management.

Winfried Adalbert Etzel 0:58

I have with me today Alexandra, and Alexandra is working for Gjensidige in Oslo, an insurance company. She is quite active in the environment, talking about data mesh on the Data Mesh podcast together with Scott Hirleman. This is an episode coming out in December. She's also talking about MLOps and is active in the MLOps community, also recording a podcast there, which is going to be really interesting to listen to. Lucky for us, she's with us today. Welcome, Alexandra.

Alexandra Diem 1:28

Thank you so much. Thanks for the invitation.

Winfried Adalbert Etzel 1:32

Sure. I think this is the first time I have a fellow German on the podcast, which is really nice. I promise we're not going to do this in German. We're going to talk about software development as an inspiration for data management. There are so many different topics we can dive into here. Software engineering has always been used as an inspiration to draw from. A lot of the ways of working, a lot of the structures we have around data and data management, have been modeled according to software engineering. We're going to dive a bit into what the differences between those two are, what the similarities are, and what we can actually use for our data management practice. But before that, Alexandra, please introduce yourself.

Alexandra Diem 2:17

Yes, thank you so much for the nice introduction to both me and the topic. So my name is Alexandra Diem. I am the head of Cloud Analytics and Machine Learning Operations at Gjensidige, which is Norway's largest insurance company. I basically come from an academic background. Actually, I'm a mathematician.

Alexandra Diem 2:38

I've worked in academia for eight years, in medicine, building machine learning models on blood flow both in the brain and in the heart, and I've basically been looking at data from various different angles, and also at software, for quite a while.

Alexandra Diem 2:58

When I was kind of finished with academia, I started working as a consultant, because that seems to be the easiest transition for former academics into the world of industry, and I was working both on a Python software engineering project and a data science project. It was during that time that I basically realized that software development methods, and this journey that software development went through during the last 10 years working towards DevOps and agile development, are something that we can really benefit from in the data space. Eventually, Gjensidige apparently agreed with me on this, because they were looking for a leader for a team on exactly this problem. So my team now is actually a team of software developers within the data space, where we bring DevOps and software engineering best practices into the analyst teams that we have and teach them how to build data products that are easy to maintain, easy to build on and easy to develop further.

Winfried Adalbert Etzel 4:11

Right, and I don't want to dive into the topic straight away. I want to get to know you a bit better. So what do you do when you don't work with data? What are your hobbies?

Alexandra Diem 4:21

Right now, well, not right now because I'm in the office, but at home I was just looking at the first 30 centimeters of snow, which I'm extremely excited about because that means the ski season is about to start. So that will probably be my number one activity away from the screen during the winter. During summertime you'll find me mostly on two wheels, on either a road bike or a gravel bike, exploring Norway, which, in my opinion, is the most beautiful country that we have in Europe.

Winfried Adalbert Etzel 4:53

Oh, I definitely agree with you on that one. When you started out in academia, where did your interest in the data world come from?

Alexandra Diem 5:01

Yeah, that's an interesting question. So I was actually working on, basically, models of partial differential equations, trying to describe blood flow in various parts of the body, and so it was more on the data-producing side. But in these models you have so many parameters that you need to determine somehow, and you just can't measure them. What you do in the end is use a supercomputer and basically do a large parameter sweep to try and get results on a higher level that match experiments from my medical colleagues, so that I could actually convince them that the models that I built, with the parameters that we're using within a certain confidence interval, actually are believable. So that's where I started out analyzing larger amounts of data. I also dove into the issues of tracking data properly and knowing what my versions of data are, which code produced which data and how I should handle those. So lots of mistakes were made there.

Winfried Adalbert Etzel 6:11

Yeah, and then you are right in the middle of it, right in the middle of the data world, and Gjensidige is also there. It's not the first insurance company I've talked to on the podcast. We had a couple of other insurance companies on and talked about various topics, from artificial intelligence on the one side to automation of processes, but also about data strategy. So I feel like there is a certain push in the industry, and especially in the sector, towards data-driven, towards AI, towards implementing data as a way of problem solving. How do you see the state of data in Gjensidige?

Alexandra Diem 6:52

Yeah, so if there's one industry that I would call truly data-driven, it is definitely insurance. Data is so central to the way we price our products and the way we make and offer new products, and it's so inherent that our data teams are actually very decentralized. They sit within the domains, and so we have over 20 analyst teams that sit within the different business domains of private insurance, commercial insurance, claims, pricing and so on and so forth. So we have a lot of analysts. I think it's over two, yeah, maybe two to three hundred-ish. So it's extremely central to us.

Winfried Adalbert Etzel 7:41

And since you've also been talking to Scott about data mesh as a concept, how do you see data mesh as a concept for the industry and for Gjensidige?

Alexandra Diem 7:51

Yeah, so it basically comes down to the number of analyst teams that we have. We come from a central data warehouse, like most companies do, with a central data engineering team that takes care of this data warehouse. But then eventually you see this problem: with the number of analyst teams rising linearly, the tasks for data engineers rise more like exponentially. So this really isn't quite so sustainable in my opinion. Exponential growth rarely is, as we've seen during the last couple of years. So data mesh for me is basically all about handing some responsibility for data back to the analyst teams or, as I rather like to call it, empowering analyst teams to take charge of their own data and take more responsibility. So basically taking some of the load off the data engineers, so they can work more as an enabling team for the really tricky cases, and so everyone can use their expertise to the maximum benefit of the company.

Winfried Adalbert Etzel 9:07

I think you end up in this trend when we talk about stages of maturity.

Winfried Adalbert Etzel 9:12

If you are at a low stage of maturity and you just start out on your journey, usually you start with a data scientist or data analyst, and then at one point you realize that, wait a second, I kind of need some data engineers to actually get that working.

Winfried Adalbert Etzel 9:26

I can't expect the data scientists to do all the work. And then you get to the stage that you are describing. You have the teams established in your domains, decentralized, but you have a central team that has to support all of them, and they become kind of a bottleneck for operations because everything has to go through that team. And then you have handovers, you have handshakes in between: the data analyst on the decentralized domain team has to talk to an engineer to describe his problem, the engineer probably has to talk to some other engineer in another part to describe the problem once more, and then you kind of have a Chinese whispers game, right? One argument that comes up a lot when you talk about data mesh is that data mesh is all about scalability, and I agree on that. But you have to come to a certain size of company, a certain level of complexity, for it to make sense to scale in a data mesh fashion. What would you say is that limit or threshold?

Alexandra Diem 10:25

Yeah, that is a very, very good question, and actually difficult to answer for me, because we come from the completely opposite end.

Alexandra Diem 10:32

So we have this, right, we come from this pretty big data warehouse that we don't really want to split up, because I am not a fan of doing work over when it's already done. Basically, we're just trying to find new data sources or new data products, and then letting the analyst teams take charge of those.

Alexandra Diem 10:57

On the other hand, of course, if you only have one data product, then you don't have a data mesh. At the end of the day, it's not so important for me per se whether the data mesh is actually a mesh in the sense that you have connections from absolutely every team to every other team. I'm focusing a lot more on the four principles of data mesh, and you can follow those with basically two teams, right? As soon as you have two data products, you can start following those principles and build from there, and then you could call it a data mesh because you're following the most important aspects of it. And so for us, when we take this more pragmatic approach,

Alexandra Diem 11:48

we will most likely end up more in a sort of hub-and-spoke situation, where we still have a lot of the central data warehouse that we already have, and then data products around it, with some ad hoc combinations between data products where they are appropriate, but most connections coming into the central hub.

Winfried Adalbert Etzel 12:11

One last question on the data mesh topic, and then I'm going to move over to software engineering, just because I think it's really interesting: many companies are in a situation now where they're looking at what the benefits of a data mesh are for them and what the pitfalls are. One thing that always gets a bit confused, I think, is the whole data platform part of it. When you talk about a self-service platform, and I think a central data warehouse is not a wrong setup at all, especially since you then have a domain-agnostic platform that is easier to self-serve to everyone than domain-specific platforms, what do you think is the key element on the platform side?

Alexandra Diem 12:58

Yeah, I'd say onboarding is, in my experience now, actually the one single key.

Alexandra Diem 13:06

So for us, we interpret the self-service platform part as being purely on the technical side of the platform.

Alexandra Diem 13:18

So, for example, which kind of platform you use: maybe Databricks, maybe Snowflake, or you are waiting for Microsoft Fabric to come out. Yeah, so I'd say the most important part is definitely onboarding of the analyst teams onto the platform. The platform itself, for us, we interpret purely from the technical perspective of which platform we're building. The data warehouse, for example, we see more as a data product within the platform. But when it comes to onboarding, we found out that my team especially has a very different approach to what other teams would consider onboarding. For us it's very important that you actually take analysts by the hand, almost, and show them how you develop a data product from end to end, what a data product actually looks like when you've built it end to end, and what you need to be able to publish it such that other data teams are able to use it further on.

MLOps and Software Engineering Principles

Winfried Adalbert Etzel 14:23

Really good, thank you. I think you are onto a lot of the topics that I find particularly interesting. We talk about documentation, we talk about structure and governance, and I'm not going to deep dive into that. We want to move over to software engineering as an inspiration for data management, and you talked about, or you mentioned at least, MLOps as part of your responsibility. Maybe just do a quick definition and recap of what you understand by MLOps?

Alexandra Diem 14:53

Yeah, so I'm not sure whether I have a clear-cut definition, and from my perspective you may as well call it DataOps or just plain DevOps, if you want to go very much back to the roots. But it's really just the application of DevOps principles in the data space. It could be applied to data products or machine learning models, and so it's basically using software development best practices for operationalizing anything that you want to run in production, such that it is easy to maintain and easy to build upon. This includes, for example, continuous development and continuous integration, automated testing, general automated orchestration, and a bunch more.
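To make the automated testing point concrete, here is a minimal Python sketch of data checks that a continuous integration job could run against a data product. It is only an illustration under stated assumptions: the `Check` class, the `run_checks` helper and the claims records are hypothetical and are not taken from the episode or from Gjensidige's setup.

```python
from dataclasses import dataclass
from typing import Callable

Row = dict  # one record, e.g. {"claim_id": 1, "amount": 1200.0}


@dataclass
class Check:
    """A named rule that every row of a data product must satisfy."""
    name: str
    predicate: Callable[[Row], bool]


def run_checks(rows: list, checks: list) -> dict:
    """Return the number of failing rows for each check."""
    return {
        check.name: sum(1 for row in rows if not check.predicate(row))
        for check in checks
    }


# Hypothetical sample data; a real pipeline would read from the warehouse.
claims = [
    {"claim_id": 1, "amount": 1200.0},
    {"claim_id": 2, "amount": -50.0},     # violates amount_non_negative
    {"claim_id": None, "amount": 300.0},  # violates claim_id_not_null
]

checks = [
    Check("claim_id_not_null", lambda row: row["claim_id"] is not None),
    Check("amount_non_negative", lambda row: row["amount"] >= 0),
]

failures = run_checks(claims, checks)
# A CI job could fail the build when any count is non-zero.
build_ok = all(count == 0 for count in failures.values())
```

Keeping the checks as plain data next to the pipeline code means they are versioned and reviewed like any other change, which is the DevOps habit being described.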

Winfried Adalbert Etzel 15:45

And with that we are in the middle of the inspirational factors from software engineering for the data world. Why do you think it is so important? Why have we drawn so much inspiration from software engineering?

Alexandra Diem 15:57

Yeah, I think it's mainly because we can actually directly translate what data products are into what software products are. Data products have the same need for testing and optimization and for an orchestration process that allows for a smooth deployment. So I'd say it's simply a lot of the same problems and questions that we have in the data space that software engineering has been going through over the last 10 years, when they started going from big monolithic software projects to microservices.

Winfried Adalbert Etzel 16:46

And this is kind of, at least from my experience, an interesting fact: the companies that are furthest on their way to a data mesh implementation, to an MLOps implementation, also went through those cycles earlier, for example through the microservice implementation, right? What do you see as the key elements of the software engineering processes that you can translate directly to data?

Alexandra Diem 17:14

Yeah, exactly. In my opinion, when you look at data mesh, it is basically just a translation of the same principles from software engineering into the data world. In software engineering, you also need to be domain-driven: software engineers need to have the domain expertise for what they are developing, and they need to think in terms of product and basically develop X as a product. They need self-service platforms to deploy their product, so they don't rely on an operations team to do the deployment on a certain date at a certain time for them. And you also need federated governance, which in a software world I'd rather call best practices on how you write code, because maybe 70 to 80% of your time you'd actually spend reading other people's code instead of writing your own. So it would be rather nice if everyone has learned to write the same kind of style of code, so it's easy to read others'.

Winfried Adalbert Etzel 18:20

Really good. So we have the best practice part, we have the data governance part, we have the domain knowledge part. That is quite easy to compare. And we also have, which I find particularly interesting, the product thinking part, which is still kind of new in data. It's not that long ago that we started to think about product management principles in data and then about data products, and, yes, this is also one of the key elements in a data mesh, though data product thinking has been there for longer. So I want to go a bit into a couple of pitfalls that always come up when you talk about comparing practices in one space to another, because there are differences, obviously, as much as there are things that we can compare. So let's go through a couple and just see what your opinion is. One that is coming up quite a lot, especially from the data side of things, and we also dedicated a podcast episode of its own to it, is the topic of life cycle. In version one of the DMBOK, which is our body of knowledge, the Data Management Body of Knowledge, there was a quite nice drawing comparing the software development life cycle on the one side and a data life cycle on the other side. I don't know why they took it out in the second version, but I really enjoyed this one.

Winfried Adalbert Etzel 19:45

And there is a certain overlap in the development life cycle, but there are clear differences, especially when it comes to the late phases. Once you have delivered your product in a software engineering or software product development setting, it kind of goes into operation and it will be maintained. But in a data life cycle, you have a couple of other phases in the late stages: an archiving stage, which has its own complexity, and a purging stage at the end. How do you get rid of data that you don't need anymore, or shouldn't have anymore? Are there any problems that might occur if you try to fit everything into a software development life cycle?

Alexandra Diem 20:29

Yeah, that is a tricky question, but for sure. I think if you follow the principles of trying to automate as much as possible, you will be able to get away with a lot, I would say. And I think one of the central aspects that is maybe more relevant to data than to software is monitoring. You need to have a lot of it.

Alexandra Diem 20:56

So you need to ensure that you have the appropriate reports and notifications set up on all of the aspects of a data life cycle, and you probably need to be a little bit more vigilant, with a manual walkthrough of what data you're still using and which data you aren't using. One way you could achieve that, for example, is by being very smart about how exactly you build your staging models and your views or the analytical base tables, ABTs, that you're actually using in your models. If you work on keeping those to the minimum that you actually need, and you have a good tag that helps you track where data is used and where it's coming from, then, with those mechanisms in place, I think you can achieve a lot with the software development life cycle process.

Winfried Adalbert Etzel 21:56

I think one of the best examples for the late stages is from my work for the National Archives a while back. There is certain documentation or certain data that you have to keep for what, 10, 15, 20 years, or the life of the business, a much longer time period than you have your applications running, right? So you end up in something like a five-year cycle of migrating to a new platform or a new application to keep that data alive. And every time you migrate, you make certain changes to the data that you need to document. You need to ensure that you can track it throughout. And then you are in the middle of the entire lineage discussion. I don't want to go there, but is there something in the way you process, the way you document throughout the life cycle, from software development, that can help you in those processes?

Alexandra Diem 22:47

Yeah, so for us, we're trying to move to a process where documentation lives exactly where the code also lives, and is actually part of the code. Because in software development, one of the biggest problems actually is not a lack of documentation, but outdated or wrong documentation, and that happens automatically as soon as documentation and code are kind of separated or, in the worst case, if you have two places where you need to update the documentation.

Challenges and Solutions for Data Engineering

Alexandra Diem 23:22

And so we are solving this problem, for now at least, by using dbt, which we think is a pretty good tool, because it basically turns your entire ETL process into a software product. You document exactly in your data models what it is you're actually publishing there, and then you get a documentation page automatically. So it's a very automated way of writing ETLs. And it's also, as I like to call it, basically SQL spiced up with Python, so you actually get some software development capabilities on top that you don't natively have available in SQL. So we think that's pretty neat.
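The idea of documentation living with the code and a documentation page being generated from it can be sketched in a few lines of Python. This is not dbt and does not use dbt's API; the `Model`, `Column` and `render_docs` names, as well as the example model, are hypothetical and only illustrate the principle.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    name: str
    description: str


@dataclass
class Model:
    """A data model whose documentation is declared next to its SQL."""
    name: str
    description: str
    sql: str
    columns: list = field(default_factory=list)


def render_docs(models: list) -> str:
    """Build a plain-text documentation page from the model definitions."""
    lines = []
    for model in models:
        lines.append(model.name)
        lines.append(f"  {model.description}")
        for column in model.columns:
            lines.append(f"  - {column.name}: {column.description}")
    return "\n".join(lines)


# Hypothetical model; names and SQL are illustrative only.
claims_per_customer = Model(
    name="claims_per_customer",
    description="Number of claims per customer.",
    sql="select customer_id, count(*) as n_claims from claims group by customer_id",
    columns=[
        Column("customer_id", "Unique customer identifier."),
        Column("n_claims", "Count of claims filed by the customer."),
    ],
)

docs_page = render_docs([claims_per_customer])
```

Because the descriptions sit in the same definition as the SQL, a change to the model and a change to its documentation land in the same commit, which is what keeps the two from drifting apart.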

Alexandra Diem 24:16

But yeah, the move to dbt basically also came from the observation that previously we didn't have a great lineage process in place. So, for example, there could be a model within the private insurance division that suddenly wasn't running anymore because a table had been removed somewhere, because the team who originally ordered that table didn't need it anymore and wanted it removed, without having realized that other teams were now also using that table. This issue we can really drastically reduce by having this nice lineage, and also much more appropriate documentation, exactly in the place where we build the data models, along with testing and so on.
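The removed-table problem described here is essentially a question about a dependency graph: before dropping a table, find everything downstream of it. A minimal Python sketch, assuming a hypothetical `upstream` mapping from each model to the tables it reads (the table names are invented for illustration):

```python
from collections import deque

# Hypothetical lineage: each model maps to the tables it reads from.
upstream = {
    "stg_claims": ["raw_claims"],
    "abt_pricing": ["stg_claims", "raw_customers"],
    "churn_model_input": ["abt_pricing"],
}


def downstream_of(table: str, upstream: dict) -> set:
    """Return every model that depends on `table`, directly or indirectly."""
    # Invert the mapping: table -> models that read it.
    consumers = {}
    for model, sources in upstream.items():
        for source in sources:
            consumers.setdefault(source, []).append(model)

    # Breadth-first walk over the inverted graph.
    seen = set()
    queue = deque([table])
    while queue:
        current = queue.popleft()
        for consumer in consumers.get(current, []):
            if consumer not in seen:
                seen.add(consumer)
                queue.append(consumer)
    return seen


def safe_to_remove(table: str, upstream: dict) -> bool:
    """A table is safe to remove only if nothing reads from it."""
    return not downstream_of(table, upstream)


affected = downstream_of("raw_claims", upstream)
```

With lineage available in this form, the team that wants to retire a table can see the affected consumers before removing it rather than after a model stops running.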

Winfried Adalbert Etzel 25:12

All right, let's go to pitfall number two: feature-driven or functionality-driven versus data-driven. So we'll talk a bit more about the software engineering mindset and how it fits the data mindset. When we talk about feature-driven, we talk about moving from feature to feature and building up our software solution incrementally. When we talk about data-driven, you kind of have to have a holistic view of how this data is collected, used, even purged. We talked about the late stages already. So how does that fit together?

Alexandra Diem 25:47

You have all the difficult questions, don't you? Yeah, though I actually like to think from an even different kind of perspective, and that is problem-driven.

Alexandra Diem 26:00

In this process of trying to introduce more and more software best practices into our data teams, we are trying to teach them to think about hypotheses on what they're trying to achieve, and then to come up with a minimum viable model, in the sense of a minimum viable product, that would test this hypothesis and either negate it or continue to support it.

Alexandra Diem 26:28

So we're really trying to shift the focus from people thinking about the data solution that they want to develop to thinking about which problem they're trying to solve. It has happened that we had a hypothesis that a specific problem should be solved by a Python application on top of the dbt ETL processes that we implemented, so that we could then serve it into a model. And it turned out, when we did that process problem-based, that we had absolutely no need for the Python-based application at all and we could solve the entire problem just with dbt, so we just made it as simple as possible. And then, yeah, of course our solutions sometimes won't end up being the sexiest machine learning solution that we maybe envisaged before, but at least we solved the problem, and that's much more important for us.

Winfried Adalbert Etzel 27:42

I really enjoyed that one, and it leads us right to pitfall number three: data science versus engineering mindset. We talked about hypotheses already, and about having an explorational approach, or more of an academic approach, to solving problems rather than the structural approach that you have with an engineering mindset, and I think that can cause friction and misunderstanding. It starts at the quite technical level already: if you talk about how a data scientist writes code compared to a data engineer writing code, there is a difference, up to the high-level picture of how do I get from a question to a result. So how do you solve that?

Alexandra Diem 28:25

Yeah, this is also something that I remember experiencing when I was a consultant in a data science project. We were trying to figure out what kind of solution we should work on, and then we started with the data exploration, and as the data scientists we were, we could have done data exploration for a long time.

Alexandra Diem 28:49

Basically, one could search for as long as one wants. At the end of the day, I don't think this process should really be eliminated, because that's where you find the really interesting problems or the interesting potential solutions.

Alexandra Diem 29:05

So here we rather try to introduce a process of time-boxing these kinds of efforts into sprints, again like you would do in software engineering. In software engineering, of course, most of the problems are more straightforwardly defined, and it's just a matter of implementing them and having enough manpower or time available to implement them. But we can use the same process for the explorative phases. We can say we work towards this particular data set and investigate which variables might be interesting, which variables could give us new interesting insights that we haven't had before. As long as you time-box it and don't dive all too deeply into just one part of the data set at a time, you'll be able to balance the hypothesis-driven data development with the explorative phase and trying to find new exciting problems.

Winfried Adalbert Etzel 30:13

This is a really good approach, to time-box those into sprints. I think one thing about the understanding of the data science process that is still lacking, especially in senior management, is that our hypotheses can be wrong, and you can end up using time, energy and resources on something that doesn't give you the result that you're looking for. But then you know it's not going to give you the results you're looking for, and if you time-box it into a sprint, you also limit the resources used on that.

Alexandra Diem 30:44

No, exactly. That's exactly the approach that we're also trying to communicate. A negative result is not a negative result. A negative result is a very positive result, because, like in mathematics, there are a few things you can prove with certainty, but most things you can only really disprove with 100% certainty. And so once you've done that, you know it doesn't work. So you can just stop spending more resources on that particular problem, or you figure out that you need more data, and then you would know where to potentially look for more data. But at least you know what to do next.

Winfried Adalbert Etzel 31:27

All right, let's have the last pitfall, and this is a bit more about the environment that we are moving in. In software engineering, there's a classic test, development, production split, and there has been a lot of talk about how that actually fits with how we work with data, and it just doesn't really fit. So there are certain challenges there, especially when we come to the discussion of why it is so hard to move data projects into production, out of that endless loop of pilots and POCs. Maybe the environmental setup that we have chosen from a software engineering perspective is not the right fit.

Alexandra Diem 32:17

Yeah, so that's a very important problem, and we've also learned this kind of the hard way: we can't really have a separate development environment for data products like we can have for software engineering products. But what we've found to be a relatively well-working solution so far is using schemas as environments. Every analyst creates their own development schema, where you start working with your staging models and create views for the tasks that you're working on. Then we just need to move those into test, where we have a test schema that is shared between all the analysts in that particular team, and in the end we move to a production schema. By doing it that way, we can still have development, test and production environments, but they're all working on the same live, hot data.

Alexandra Diem 33:29

They're all basically using the same code, and most of our code is not that complex, so it's not super important there to have the separate environments. We also encourage having small branches that are merged often into main, such that if there is a problem, you can just revert and the problem is solved again. You need to try again with a new branch, but what has mostly been working very well is focusing the different environments on the data model.
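[Editorial illustration, not part of the conversation] The schemas-as-environments idea described above could be sketched roughly as follows. This is a minimal sketch under stated assumptions: the schema naming pattern, the function names `schema_for` and `create_view_sql`, and the table `staging.claims` are all hypothetical, and the episode does not say which tooling or naming the team actually uses.

```python
from typing import Optional

# Hypothetical environment names; each one maps to a schema in the same database.
ENVIRONMENTS = ("dev", "test", "prod")


def schema_for(env: str, team: str, analyst: Optional[str] = None) -> str:
    """Return the schema that plays the role of the given environment.

    Assumed convention: one personal schema per analyst for development,
    one shared test schema and one production schema per team.
    """
    if env not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {env!r}")
    if env == "dev":
        if not analyst:
            raise ValueError("a development schema needs an analyst name")
        return f"dev_{analyst}"
    return f"{team}_{env}"


def create_view_sql(env: str, team: str, view: str, select_sql: str,
                    analyst: Optional[str] = None) -> str:
    """Render the same model code as a view in the schema for `env`."""
    schema = schema_for(env, team, analyst)
    return f"CREATE OR REPLACE VIEW {schema}.{view} AS\n{select_sql}"


# The model code is identical in every environment and reads the same live data.
MODEL_SQL = "SELECT claim_id, amount FROM staging.claims"

dev_sql = create_view_sql("dev", "claims", "claim_amounts", MODEL_SQL, analyst="alex")
prod_sql = create_view_sql("prod", "claims", "claim_amounts", MODEL_SQL)
```

Only the target schema differs between the rendered statements, which mirrors the point that development, test and production share the same code and the same data.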

Winfried Adalbert Etzel 34:10

So I think this is a really good approach to also be more model-driven. We talked about being feature-driven at one point. I think there's also a way to be more focused on your models, and then also on the model life cycle. So that was a really good way of summing up all four pitfalls. I really enjoyed that. We went through a lot already. There are probably more pitfalls out there, and probably more things that you could talk about. Is there anything that you find particularly interesting that we haven't covered yet?

Alexandra Diem 34:43

Let's see. I think we've managed to get through quite a lot, so there's not anything that jumps out at me right now. I think we can give ourselves some kudos for actually having covered a lot of ground

Emphasizing Pragmatism and End-to-End Solutions

Alexandra Diem 34:59

here. But really, the most important thing for me, which I'm always trying to convey, is to be pragmatic. With whichever methods or way of working you're trying to implement, 80% of the effort gets you most of the way there, and it's much better to do 80% of the effort than to struggle with the last 20% and try to squeeze it into the organizational architecture or setup that you're working with. So being pragmatic would be one of the key emphases that I would like to put on this.

Winfried Adalbert Etzel 35:44

I really like that. Perfect, I mean, you're good, right? Thank you so much for a great conversation. I really enjoyed this one. Is there anything that you want to say at the end, before we finish off?

Alexandra Diem 35:56

And just, yeah, maybe just give it a go. Try to time-box your data product, and try to think about developing solutions for real problems, end to end, and test the process. Have cross-functional teams that can produce the entire value chain, and then you should all be good to go.

Winfried Adalbert Etzel 36:23

Thank you so much.

Alexandra Diem 36:24

Thank you.

Winfried Adalbert Etzel

Host