MetaDAMA - Data Management in the Nordics

3#7 - Heidi Dahl - Transforming Business with a Strategic Approach to Data (Nor)

Heidi Dahl - Senior Data Scientist, Posten Bring AS - Season 3, Episode 7

«Sentralt I dette med å skape verdi er tverrfaglighet og involvere hele bedriften, ikke bare et lite Data Science miljø.» / «Central to creating value is multidisciplinarity and involving the entire company, not just a small Data Science environment.»

Prepare for a journey into the landscape of data strategy with seasoned Data Scientist Heidi Dahl from Posten Bring, one of the largest logistics organizations in Norway. She is not only engaged in strategic discussions about data, AI and ML, but is also a passionate advocate for Women in Data Science: she took the initiative to create a WiDS chapter in Oslo and co-founded Tekna Big Data.

In our chat about the dynamics of data science and IT, we talk about the balance between research and practical development. Heidi articulates the urgent need for a dedicated data science environment and explores the hurdles that organizations often confront when creating one.

We cross into the world of logistics, shedding light on the potential of data science to revolutionize this industry. We uncover how strategic use of data can streamline processes and boost efficiency. Finally, we underscore the importance of nurturing an environment conducive to data professionals honing their skills, and highlight the role of a data catalog in democratizing data accessibility.

Here are my key takeaways:

Digital Transformation of Posten Bring

  • An organization that is 376 years old and has been innovative throughout all of those years.
  • The Data Science department was started in 2020 under Digital Innovation, and is now a part of Digital Technology and Security.
  • The innovative potential is found through use-case based work closely integrated with the business domains.
  • Several algorithms have made their way into production, and that is the goal to measure against.
  • The Data Science teams consist of cross-functional skillsets, bringing together Data Science, Developers, Data Engineering and Business users.
  • The exploratory phase is vital, but has to have a deadline.
  • IT driven development projects do not always match with the needs of Data Scientists.
  • Data and IT need to work together, but for exploratory work, Data Science should be able to set up the infrastructure it needs.
  • On cloud infrastructure it can be wise to think multi-cloud to ensure availability of a spectrum of relevant services.
  • Posten/Bring is looking to build a digital twin for their biggest package terminal for better insight, control and distribution of packages.

Strategic use of data

  • How can we use data to make better decisions, be more effective and smarter?
  • The 4 core elements of the Data Strategy:
  1. Establish distributed ownership of data and data products.
  2. Increase the amount of self-service.
  3. Build competency tailored to your user groups' needs.
  4. Strive towards the goal of great services and products based on data for your users and customers.
  • Role-based self-service capabilities.
  • A data catalog is discussed as a way to gain a better understanding of the data available and its security, but also its context of origin and data lineage.
  • A data catalog needs to be able to serve different user needs.
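
To make the catalog points above concrete, here is a minimal sketch in Python of what a catalog entry with an owner, context of origin, lineage, and role-based visibility could look like. It is an illustration only and does not describe Posten Bring's actual catalog: the class names, fields, role names, and dataset names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """One dataset in a hypothetical data catalog."""
    name: str
    owner: str            # team accountable for the data (distributed ownership)
    origin_context: str   # why and how the data was collected
    upstream: list[str] = field(default_factory=list)      # direct lineage
    allowed_roles: set[str] = field(default_factory=set)   # roles that may read it


class DataCatalog:
    """In-memory catalog that answers 'what can I see?' and 'where does it come from?'."""

    def __init__(self) -> None:
        self._entries: dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        self._entries[entry.name] = entry

    def visible_to(self, role: str) -> list[str]:
        """Role-based self-service: dataset names a given role may read, sorted."""
        return sorted(n for n, e in self._entries.items() if role in e.allowed_roles)

    def lineage(self, name: str) -> list[str]:
        """All upstream datasets of `name`, nearest first, each listed once."""
        seen: list[str] = []
        queue = list(self._entries[name].upstream)
        while queue:
            current = queue.pop(0)
            if current in seen:
                continue
            seen.append(current)
            if current in self._entries:
                queue.extend(self._entries[current].upstream)
        return seen


# Hypothetical example data, invented for illustration.
catalog = DataCatalog()
catalog.register(CatalogEntry(
    name="parcel_scans",
    owner="terminal-operations",
    origin_context="Scanner events recorded when a parcel passes a sorting line",
    allowed_roles={"data_scientist", "terminal_analyst"},
))
catalog.register(CatalogEntry(
    name="daily_volume",
    owner="data-science",
    origin_context="Daily parcel counts aggregated from scans",
    upstream=["parcel_scans"],
    allowed_roles={"data_scientist", "manager"},
))
catalog.register(CatalogEntry(
    name="volume_forecast",
    owner="data-science",
    origin_context="Forecast of parcel volume per day",
    upstream=["daily_volume"],
    allowed_roles={"manager"},
))

print(catalog.visible_to("manager"))
print(catalog.lineage("volume_forecast"))
```

The design choice worth noting is that ownership and context of origin sit on the entry itself, so a user can find out who is responsible and why the data was collected without first knowing whom to ask.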

Competency

  • There are three perspectives:
  1. How to recruit new and needed competency?
  2. How to train and share competency internally?
  3. How to retain competency?
  • Data engineer is a newer, more specialized role that is hard to find on the market.
  • You need to give your data professionals the opportunity to do purposeful work, bring it into production, and connect it to value creation.
  • The entire organization should be aware of how to use data to make work more efficient and smarter: think data literacy.

Speaker 1:

This is MetaDAMA, a podcast on data management in the Nordics. Hi and welcome. I'm Winfried, and thanks for joining this new episode of the podcast, where we show how data management is practiced in the Nordics and how you can advance your competence in the field. That is why I invite Nordic experts in data and information management for a talk. Welcome to today's episode of MetaDAMA. There are two things I'm very happy about today. One is having Heidi Dahl on the podcast to talk about data strategy and how companies can use data more strategically. The other is that we finally have an episode in Norwegian, which is always very nice.

Speaker 1:

What we're going to talk about is data strategy. What does data strategy actually mean? Many say that a data strategy is what makes you work strategically with data, but how do you do that? How do you define a strategic need for a company? How do you define the connection between data and the organization, and how can you create value with data? There is also another perspective here. Many companies talk about becoming data-driven, but how can you become data-driven without knowing the goal? A data strategy can really help on this path by defining what data-driven means for your company, and, not least, it is a way to secure the investments in technology and data that you make in the company, which is more important than ever. Hi, Heidi, welcome. Hello. Please take the time to introduce yourself and what you do.

Speaker 2:

I work as a senior data scientist in the data science department. I work with overarching data strategy: how we use artificial intelligence, how we work with machine learning, how we get insight out to the organization, and how we work with data literacy. I also work quite actively with networking, getting our people into the data forums so they can find their way in the data field. I have a background as a researcher. I came to Posten two years ago after 16 years at SINTEF, and I managed to come out of academia in a strange way. I wanted to actually create and change things, not just talk about them. I have even become a little bit more aware of academia.

Speaker 2:

I have also been building up Tekna Big Data. I think it was when I started it. I went out and started it a year ago. I was also an initiator of Women in Data Science Oslo, which is an initiative that you have had on the podcast before. I worked with my own ladies here, created the conference and waited for the good ladies. There are so many of them, so I'm a little bit happy to ask them how we find these ladies to get more intelligence from them, and it's not good to be a part of it.

Speaker 1:

The episode with Alexandra and Sherry from Women in Data Science is in season two. It's a very good episode about diversity in the data science world, which is also a very exciting topic that you have been talking about. How do we find these women? They are out there, and they are very active, not just in Norway but all over the world. There is a lot happening, and a lot is going well. What do you do when you are not at work?

Speaker 2:

I sing in a choir. I have sung in choirs since I was little, so that is something I use my free time for. We are building up to a concert in mid-November with Brahms's Requiem, with a big orchestra, so that is what I spend my time on. Fantastic.

Speaker 1:

I ask this question to everyone who is watching the podcast. It's very exciting that there are a lot of people playing instruments. Maybe in a while we will get a data band in Norway. I'm looking forward to that. What is your interest in data?

Speaker 2:

I would like to create value for the world. I would like to make the world better. To be honest, how can I build my own strength? How can I get closer to the world better? So I work with innovation and innovation. It's very exciting. Data science is a place where you can create value for the world, in the world, and at the same time, you can go to the lab and be creative.

Speaker 1:

We will talk about data strategy today. A topic that often comes up is where resources are used strategically. With your background in data science, you are a very open-minded person and they understand that data science has been very sexy and maybe the cool end of the data side. I come from data management, the data governance side, which can be a little more boring, and then we have the data engineering side, which is forgotten, and then end the data scientists with you and do the job to the data engineer. How do you see the whole distribution of data competence?

Speaker 2:

I like to develop in parallel with the production of the industry. One of the projects I worked on while I was in Sintef was the research of industrial projects. How can we use data analysis to make the production more effective, better and better quality? So I just do things. If you focus on the production of support for the companies, you don't just focus on how you should design the support. You have to think about the value. So you have to think about how you export raw materials, how you correct it, how you make raw aluminium that you can form to layers, and that is something that has been a bit unbound in the data field, and we actually live for the same type of value-making thinking. You have to have raw materials, you have to have the data on the spot. You have good logistics, after all, so you have solid data. You have to have a little time, and then you have analysis on the top of that, which then can be moved.

Speaker 2:

In a way, I see that there has been a change over the last years. Most people are by now familiar with the Google figure of the landscape for data science, where you have ML as a small black box in the middle and a lot of white boxes around it with everything else that is needed.

Speaker 2:

I have the impression that there is a better and better understanding of what a data science team needs. You do not only need data scientists; you also need data engineers, because what data engineers can do is not necessarily what data scientists are very good at. You need a lot of development to build infrastructure, to get data forward to the algorithm, but also to deliver the models to the end user. And the foundation of it all is the data. It does not matter how good our algorithm is if we don't have the data in order: garbage in, garbage out. This is a big field where we all have to work together on what we want to create. You can also turn it around: some say that data is the new oil, but you also have to find the oil. So just data or just analysis is not enough. You need both data and analysis to create the money.

Speaker 1:

Very good. I know we are a little off topic, but I have a question for you, because there has been a discussion around it: are companies in Norway actually ready to work with data science? When you work with data science you work experimentally, but most companies are set up to work in an engineering way, where you know in advance what the result should be, and you don't get that with data science.

Speaker 2:

I think the answer is that we are all very young models. We would love to have a younger generation, but I think we are on the right track. We just have to take this one slide at the beginning. If we had talked together after the second year, I would have said quite a few times that we are on the way to work and now we are done. Now we can go to the factory and start working and then we will have a new Belgian hike. It is very good for people to notice, but the name goes in the name of your creation, because the fields where there are many workers who are behind the creation of such a service we can go and work and then it is a little interesting landscape on the way since the development of their fun and fun conditions. We have a job done. It is not possible to get out of there.

Speaker 1:

I had someone on the podcast earlier who said that you can ask me about everything as long as you don't ask about ChatGPT. Let's talk a little about Posten Bring and the strategic use of data in companies. This is very exciting because Posten Bring is a well-established, old company, and driving digital transformation in a company that is, as you told me, 376 years old is not easy. Where are you on that journey?

Speaker 2:

We have been here for quite a long time, 376 years, but we like to say that Posten Bring has always been an innovative company. It started with mail carried on foot, and after that came horse and cart. When steamships arrived, the first steamship in Norway was a postal ship. When the first ferry came, Posten Bring was on it. When the railway opened, Posten Bring was also there from the start, and there were early test drives of cars [unclear]. Today we have electric deliveries in large volumes in [unclear] and in several other places in the country. So innovation is nothing new to us, even though it has always been a little bit tricky.

Speaker 2:

Data science was then developed in the year 2020. With a new division in digital innovation, it further built a long-term data collection in data from the data source and the data source, which is a real environment. But, as you say, it needs to be more work-like in data science and, most of all, the year is more like a traditional report-building environment, which is important. Data science has started with the use case, which is already from the start, where it was designed in a way to create value, and then it worked in the future with the off-in-net-trim to how best possible it can drive the job of innovation. In the past the name structure was called the management, so now it's not really the digital innovation of the data science.

Speaker 2:

But it's not all technology and stuff. It's also a signal that we need more of a direction towards industrialization, to borrow a concept. We need to have things in production. We need to create value. We also need to acknowledge the challenge that we have to work more with research and innovation than you might do in a typical IT environment. So that's where we are. We have several things in production already. When you get a notification about when your package is expected to arrive, there is a machine learning algorithm underneath. The other big thing we have is volume forecasts. They are used to plan staffing at our terminals, and there is a lot of money to save in large-scale production. We need to make sure that not too many people are at work on quiet days, but also that in peak periods before Christmas and other holidays people do not end up working long hours.

Speaker 1:

It's exciting and good news cases for illustrators. The effect of data science is there. We often try to get to industrialization to put things in production. We often expect that the investment we make in data science and experimentation should lead to a more practical and effective development, but that's a bit of a opposite. Data science is based on research. Science is based on what we do with hypothesis and experimentation, which is a bit similar to a classic data source and is based on the data. Both have the development on the right track.

Speaker 2:

We started and at the beginning we had a few teams on the data science project. We have had people from the data source to get data and do it with someone from the network to secure the connection. But, as we were talking earlier, data science is a smaller and smaller part of production of the equipment around artificial intelligence or machine learning. So we see a rather simple thing as a way to address the solutions you make. It can't just be a model of making and telling you what your problem is. Then we don't create value. So there is a balance between the sharing and the sharing.

Speaker 2:

The idea is that we will have the creativity to take a really difficult phase when we start with things, but when we start thinking about production and industrialisation, we have a platform team that makes our own machine learning platforms, where we will try to get as much of the process as possible. But, as we said, we also have developers on site. So we need to have data engineers on site and at one point we have to stop the research and we have to use the research and research and data quality on the digital information we receive from our listeners. Our customers are quite curious to know what we are going to present this around the post, that the list of things we would like to have titles to look at will only be longer and longer in the future. So at one point we had to say, okay, enough of this, we will come today. Now we have to live in the arena and make the most of the things we have seen.

Speaker 1:

There was one thing that stuck with me when you said that it is a part of the income that you have to get there as a scientist, especially when you try to put things into production. And the one thing is where you meet over time, where you like the model, where you ensure that there is no data drift, that you do not miss the data on the road, where you ensure that you have good quality in the starting point and where you ensure that you have good infrastructure to build on. And here is a collaboration that is very important and very central, and it is a collaboration between IT and data. How is it in the post? How does it connect together? It works. It is a common IT-drift.

Speaker 2:

I have to admit that we have a job to do in relation to the model both us and IT. We work in for Azure and the architecture and the setup there is our machine learning engineers and they have responsibility for that, but it is not even possible to pass by that. This platform that they develop is for our team if it is to be escalated and the whole post will be data-driven. So you have to have a dedicated platform team or integrate it in their resource level, already have a good infrastructure. But, as it was in the series, there are some tensions there because it is a pretty standard way to work in IT projects where you work with Azure, where you have access to a dangerous team and that doesn't always match the required utility of the team because you have the insecure data that it brings. So it is not just about, okay, we are going to program this and that and we are done. You have to see how the data is and how things are connected.

Speaker 2:

As you said, how is the data drift? Is there anything else that can be done? So we have a job to do there, but I must say that it is not much about the way of the data science environment and the way of the data science environment. I don't know if there are any objections there, but there is a new way of working on a new thing we do, so it is not strange that we have to use a little bit of time to find out how we can do it in the best possible way. What do you mean by this science environment? There is no objection.

Speaker 1:

I have just had a discussion about this, and this is my opinion. I think many of the challenges we face now with data in companies come from building up data environments in the same way as we built up IT environments in the 1980s. We build up an internal service provider that is supposed to be paid for the resources it delivers to the organization, when it is actually part of the core organization. That is something IT has struggled with for a long time, and perhaps accepted, and IT is often even split into two parts. One is an organization that keeps the lights on, and the other takes care of infrastructure as well as secure application selection and service. But that is not enough if we are to treat data as something strategic in the company.

Speaker 2:

I agree with that, but I think that is one of the things that makes it exciting to work in this environment, and that there is not a certain thing that is not finished yet. This is something we have to look out for, and there is always a little frustration in that period because it is a difficult question. There is not a simple answer to that, but there is a lot of work to be done.

Speaker 1:

In the work that is done. It happens very often. I think you mentioned that and that data environment. Data scientists are set to develop their own infrastructure based on their own platform services, on their needs and perhaps with the help of IT. Do you see any potential there that it can lead to tensions, or are we now in the phase where most companies are largely on the line to have the opportunity to be part of managed services? Is there really a problem? Do we still need an internal IT infrastructure?

Speaker 2:

We are there with this sense that the way is a little bit more convenient while we are going, and I think it is completely clear that we usually experiment with and find out how things are connected and what is the best way to do things Before we show others that there is the best way to do things. It is also a lot of Maybe not the best way to solve the problem with the resources on the side, but I don't know how it is in other companies, but to get resources on the side of the department is not always easy. So you should really go to the market and have a very good value for the work you need In addition to navigating internal politics and the exciting things that have been found. It is very natural I think it is the most natural thing to be there where we are now. Who knows what is the best way to solve things?

Speaker 2:

I was also part of a project around cloud computing and the development of tools for small and large companies. It was in a research setting. What became very clear there was that it was too drastic to think that we would have one cloud, and only one cloud, that would serve the whole company, also given how the market was developing in that segment. You have to think more multi-cloud and more hybrid solutions, where you choose the things you need. What you might need more of from the IT side is the registration and the setup and, not least, the IT security. When you put together many different tools, you have to make sure both that the tools themselves are secure and that all the connections between them are secure enough for IT security to be in place.

Speaker 1:

That is a good answer to a little bit of a question. Thank you. The core of our conversation is the strategic use of data. What do you think that means for Posten Bring?

Speaker 2:

The core of the question is how we can use data to make better decisions, to become more effective, and to work smarter. I can give you the four points we have in our data strategy, the four things we will focus on in the coming years. First, we will establish distributed ownership of data and data products. That means that data will not be the responsibility of one team alone; it will be a responsibility for all of us. The second is to increase the level of self-service. How can we make things easier for the people who work with data? Here we talk about different roles and different end users.

Speaker 2:

We have the data science team and the faculty members with a double contact on the data. That is right, where it is quickly taken over the data. How can I get the data out of the way and find out who I am talking to as often as possible? When you do not have the full control of, or not completely, self-management, we will be easier to use the data. We also have many faculty members around the post who have been specialized in simple things, for example, those who work on the terminal health system. They will have a different need and a different type of need where they actually give the data platform and data service on the platform, but they will have to make sure that they have autonomy on the way over.

Speaker 2:

You have the management environment, those who do not work with data but need data products to make decisions, such as the data storage, which is a bit of a burden for the staff. A number or a diana is a bit of a burden. They do not need long reports or insights on the data, but they need something that makes it easier for them to make the right decisions, and that would all hide. I am going to do self-service, so I need to be on the stage. I also need to have different needs in these four groups. And then it is finally the goal to develop and implement new and improved products and services based on data. So it is one point that covers a lot and we are on that for ourselves. So that is the high level we think about when we think about data strategy at Posten Bring.

Speaker 1:

That is very exciting. There are two things that caught my attention. One was that the first two points you talked about were establishing distributed ownership and increasing self-service. Those are two points that are very central to data mesh. Is data mesh part of the vocabulary at Posten Bring?

Speaker 2:

I am not sure about that and I do not know. It is not my personal opinion, so it is possible that I am stupid when I try to talk about it, but one of the challenges we have is that we are 376 years old, we have grown up and we have bought data landscape. We have the most important data systems. They will be upgraded and then we will move to the logistic engine. The engine that will be used is all the information about the condition of the front-end. We will be able to counter-pick up all the information and the front-end. It is completely up to you to pack or deliver on the door or the front-end in the mail. It will be shipped over to China. So that is one important thing for us.

Speaker 2:

We have this Brio data warehouse environment that is in the SASS. In addition to this, there are several other specialist environments those that work on customer service or those that work directly with our customers. The network is typically on data other places. So how do we collect these data chips? I think we will take self-training, but how do we get it to practice? That is something that we have to work on, especially because it is a complex landscape.

Speaker 1:

What struck me when you talked about the different user groups was a talk by Bill Inmon, who said that the biggest mistake we have made in the data world was the data lake: all data is in one place and people can just help themselves from the same place. I think it was very good. It is very important to identify the different user groups, because they have different needs. You cannot serve everyone the same data from the same place. Some need context, some need metadata, some need access to raw data directly. What do you think is most important?

Speaker 2:

One of the things we are working on today is a data catalog, both to get information about which data is where, who is responsible for it, who we should talk to and, not least, the context in which the data was collected, because it is very easy to make assumptions when you want to build things on data. It is important to be aware of why the data was collected, because you always make choices about data, quality, and which data you collect. I think a data catalog is the most important thing in this regard, to make the data more accessible and to avoid having to know who to talk to before we can get the data. It's a very simple and useful tool, but it's not something that happens by itself.

Speaker 1:

I think the data catalog has been discussed a lot, and there are many companies that want to establish a data catalog in some form. The important thing is that a data catalog can serve different needs, so what you have done in defining user groups is a fantastic starting point for your work towards the data catalog. Can we talk a little about competence? That topic is very exciting, and it is connected to strategy. I see two different ways to look at this. One is how we acquire competence in the company, through recruitment but also by increasing competence in the company through different programs, and the other is how we retain competence in the company, especially in data science. It is not easy to get good data scientists into the company, and it is even more difficult to keep good data scientists in the company. How do you think about this?

Speaker 2:

I think I have no idea. I think that the challenge of data scientists is a bit of a disadvantage, and we have had a lot of problems with getting experienced data engineers and data scientists Because there is a risk of more specific specialization in relation to how many, how big the pool is. People can recruit from there to get people, but we have been pretty busy the last few years All these things the way of getting the data and I want to say to you now that the data is very important. We have done something wrong here, but we have done the right thing. I can't help but think about it. Maybe focus on how far we have come, that we actually work specifically to get things in production. There are more and more companies that talk about big words or use artificial intelligence in the morning without thinking about doing anything practical and also that we work in a different time. I think that is a strength, so you have the ability to not be sitting and working alone. You work with people and build things in cooperation.

Speaker 1:

I think this is perhaps the core of keeping good people in a company, namely to have a good environment where you can practice and work in production, but, in a more general way, development of competence. Have you talked about data literacy programs in companies?

Speaker 2:

It is something we see that we need to be a driver for, as all the people who work in the field we would like to be, so that we sit in our area in open landscape and people come to us with questions that they have that are, of course, interesting to us and help them with that.

Speaker 2:

But we see that it is expected to do some work before we come here and we need to get the whole organization involved and with more confidence in how you can use data to do things smart.

Speaker 2:

And then we have to say, even if we are a little bit critical of ChatGPT and the like, we have had a project on using ChatGPT which has created a bit of momentum, because it has recruited people from across the organization who then use ChatGPT in their daily work. Not that they should sit and play with it, but to see how they can use it to solve work tasks, and in the field of development you see that it works pretty well.

Speaker 2:

So it is a challenge for us at the data science which the chat GPT is in the big field so it has come in as a user that it is something completely different from the other organizations, but we have things like data pictures. So we are presented with exciting things in the field of design and where it is possible for the most of the posts to be on the field of design. They have a field network for data analysis because they are a little more familiar with it, for a number of research experiences and a part of knowledge, to tell you a little more about what we're doing and at a level that might not be interesting for a large audience. And then we also organize what we call the COI for lunch, where we invite the next year to keep in touch with the work with the CIS, and that can vary a little, but it's typical at a level that is actually related to the COI in the series.

Speaker 1:

I like to understand the COI in the series. I always agree with the statistics and data statistics that there is no doubt about the data statistics and the dependence on the data, but we often talk about dependence on the management strategy when it is to be summarized, and that is important. But it is also dependent on the ecosystem that the company moves in. You also want to look out at markets, at opportunities, at what else is happening in the same industry, and not least at technology and architecture and understate the service structure. How do you see this ecosystem?

Speaker 2:

Logistics is a low-margin industry, that is, it is not at all very easy to pay for a contract. There are not a few companies that check out our series. Now I'm going to pay a little more for getting the company to pay for me. So of course, we have to follow what is happening around us, and you can, for example, look at the environment in the UK, where there are research centers that regularly publish reports on where logistics is moving in terms of digitalization, data analysis, and technology in general. On the other side of this, if we are to remain competitive, we need to be smart. We have volume, and we have our professional equipment, which gives us a professional network that no one else in Norway has. But we see that the volume of packages, for example, is growing, and we can scale up and build more terminals. The terminal is where we sort the packages: when they come in from post-bapties or from post-bapties, they are sorted on a large running platform in larger storage rooms, by postal code or post outputs, and sent onward across the country. So after all, there are more and more packages.

Speaker 2:

Yes, we can build more and more terminals, and bigger and bigger terminals, but there is also huge potential to use data and analysis to actually do things in a smarter way. For example, more robust solutions for reading the information on the packages. We have an area called the hospital, where the packages that cannot be handled by the machine are sent, and there is a large amount of manual handling of those packages, because it may be that the packages are damaged, but it may also be that there is no direct electronic information for the packages. Or we have a program called the network that then listened to the management of the packages and not the network that then listened to the packages. So there are a lot of areas where we can make things smarter to get more and more packages handled by the machine, along with things like optimization within the terminals.

Speaker 2:

Think of the way we place the packages on the sorting lines. If there are 200 packages from Hensom-Einrüt, should we put them all on the lines at the same time, or should we spread them out? And does it play a role which network is following these approaches, so our packages are sent down to the further transport? Which network do we have in mind. So now we have jumped in and started on a digital twin for the largest package terminal, where we plan to work with SINTEF to get good simulators and do exciting things there, especially using the digital twin concept to send data to the system and to do it in a similarly professional way as we do with package production.

Speaker 1:

So that's a lot of exciting stuff to look forward to, and thanks a lot for a very good conversation. To wrap up, do you have any takeaway or a call to action?

Speaker 2:

I just want to tell you about the arms. We are on the road, we are out of the way after that and we are moving on. But we are starting to do more data in Norway and I also focus on the data. I have a background in algorithm development, where it is very useful to sit at my own PC and tweak and find out things and do things optimally. But that will be a bit limited. In the end, central to creating value is multidisciplinarity and involving the entire company, not just a small data science environment.

Speaker 1:

Thank you very much.
