Season 1 Episode 15: Anjukan Kathirgamanathan, GridBeyond on Reinforcement Learning in the Energy Sector

This week’s guest on the Hypercube podcast is Anjukan Kathirgamanathan, Senior Data Scientist at GridBeyond. As part of the modelling and forecasting team, Anjukan joins us to talk about machine learning (ML) techniques in the energy space.

Adam and Anjukan take a deep dive into reinforcement learning (RL), a subset of ML. RL enables an agent to learn how to take the optimal set of actions by interacting with a dynamic environment, learning in a similar way to humans. It is an area with great possibilities for the energy space, but there are a number of challenges to overcome—from educating customers to having sector-specific talent. Anjukan shares his perspective as a data scientist working on real-world RL applications.


In this episode, we covered:

  • Human oversight in AI-powered trading.
  • Data quality for ML modelling.
  • Challenges in deploying RL in practical applications.
  • Managing smaller, distributed resources with RL.
  • Attracting data science talent to the industry.

The weekly Hypercube podcast sits down with leaders in the energy and utilities sectors to explore how data analytics can help businesses make smarter decisions and accelerate business growth.


[0:30] Anjukan introduces himself and gives an overview of where GridBeyond sits in the ecosystem.

[3:02] Anjukan explains why pricing is the area of forecasting where he sees the greatest value-add.

[4:31] Adam asks Anjukan to talk more about AI-based control methods and reinforcement learning.

[7:57] Anjukan digs deeper into the topic of reinforcement learning and the challenges of bringing it to market.

[10:59] Adam and Anjukan discuss how to help customers buy into reinforcement learning.

[14:41] Anjukan shares his advice for battery owner start-ups interested in adopting AI and reinforcement learning.

[16:57] Anjukan gives his view on the potential for reinforcement learning to be used within the energy industry.

[19:13] Adam asks Anjukan to share his perspective on the talent crunch in AI and reinforcement learning, specifically for the energy sector.

[21:30] Anjukan emphasises multi-agent reinforcement learning in the energy space as an area he thinks will be exciting over the next few years.  

[23:00] Anjukan speaks about the technologies he feels are overhyped in the energy space.


Hypercube Podcast Transcript.

Host: Adam Sroka

Guest: Anjukan Kathirgamanathan

Intro: Welcome to the Hypercube podcast, where we explore how companies in the energy and utility sector leverage data analytics to make smarter decisions and accelerate business growth. I’m Adam Sroka, founder of Hypercube, a strategic consultancy that supports asset owner-operators, traders, route-to-market providers, and energy services companies to unlock the power of data.

If you’re interested in hearing real-world examples of how data and AI are advancing the energy sector, this is the show for you. 

Adam Sroka: Welcome back to the Hypercube podcast. Today I’m delighted to be joined by Anjukan Kathirgamanathan from GridBeyond. Anjukan, would you like to introduce yourself?

Anjukan Kathirgamanathan: Sure. Thank you so much for having me. It’s an honor to be here and sharing a bit about what I do. So yeah, as you introduced, my name is Anjukan Kathirgamanathan, and I’m a senior data scientist at GridBeyond.

And I sit in the modeling and forecasting team. So I essentially help support the rollout and operation of renewable energy assets, and my particular focus is battery storage: monetizing its participation in the various energy and power markets by building forecasters and optimizers for the traders of these assets.

Adam Sroka: Excellent. Okay. So this is not an easy space, even though a lot of our audience live in it. Where in the kind of ecosystem does GridBeyond sit? What’s, like, GridBeyond’s target customer? Where do you add value in the chain? Because it can get quite messy, right?

Anjukan Kathirgamanathan: Yeah. So GridBeyond at a high level is in the area of smart grid technologies and virtual power plants.

So in essence, GridBeyond is helping to balance supply and demand in electrical power grids, especially those with high renewable energy penetrations, which is increasingly the case in most power grids, and especially in islanded systems. So Ireland is a good example, Great Britain, Texas, Australia.

This is where as the renewables rollout continues, real challenges emerge due to their dependence on weather and the fact that demand is not going to exactly match when the sun is shining or the wind is blowing. So this creates a huge opportunity for companies like GridBeyond. And what we do, for example, with energy storage devices like batteries and virtual power plants, these are essentially flexing the demand side to help even out those differences in supply and demand.

So we combine trading expertise with the forecasters and optimizers that I work on to help decide, for example, with the battery, when it should charge and discharge. But also, what are the different markets the battery should trade on, or purchase from and sell its energy and power to. So we’re in the energy trading space, but we’re also on the behind-the-meter demand side.

Demand side response is a big part of what we do as well.

Adam Sroka: Okay. So you touched on forecasting and that’s what you and your team are up to. There’s no end of things to forecast in what you’ve just talked about, like everything from economic markets, weather, demand, supply. What do you focus most of your attention on, and where do you get the most value from?

Anjukan Kathirgamanathan: So I would say a key focus is prices. When you’re trading these assets, your profit and loss, your performance, is only as good as your price forecast. So a lot of our unique value add is in getting the best forecasts for these different energy markets, power markets and ancillary services.

For a lot of other forecasting, we can already leverage third parties who are doing a great job. For example, weather forecasting: we would tend to use commercial providers there and use that as an input into our price forecasters. So price forecasting is where we really see the value add, especially for new markets. That’s where most of my experience has been.

We do dabble in demand forecasting and weather forecasting services. But generally, if we want the best price forecasters, it’s really good to go with the best weather forecasters, because there’s no point reinventing the wheel either. And it’s about finding our unique offering.

Adam Sroka: Yeah. It’s also a really complicated, difficult wheel as well, right?

So am I right in thinking that you’re doing some interesting stuff around, like, AI-based control methods and reinforcement learning? Could you expand on that a bit?

Anjukan Kathirgamanathan: Yeah, so just maybe touching a little bit on my background. I was a mechanical engineer to start off with as a graduate, and I was in the aerospace industry, designing aircraft interiors.

So I realized after a few years that I wanted to create a bit more positive impact on this planet, considering all of the irreversible damage humanity is causing it. That led me on to a PhD, actually in this part of the world. And my PhD was a mix of engineering and computer science, looking at how we can use AI-based control techniques like reinforcement learning to help the demand side, and in particular buildings, be a part of the smart grid and help decarbonize our energy systems.

So I was applying reinforcement learning to control building energy systems, to better control the comfort in the buildings and also to achieve economic savings on the cost of that energy use.

And reinforcement learning is a particularly useful technique in this area. So more broadly, to give an introduction to reinforcement learning for any of your listeners, it’s an area of machine learning. So it’s a subset of machine learning, which itself is a subset of artificial intelligence, which is such a big buzzword these days.

So RL or reinforcement learning is where an agent essentially learns to take the best or optimal set of actions through interacting with a dynamic environment. So very similar to how humans learn as well. And then the goal is to maximise a certain, what we call a reward quantity. So reinforcement learning rose to fame some years ago.

You might’ve heard of stories such as AlphaGo Zero, for example, beating humans at the game of Go, learning completely from scratch. And it’s been very successful in other games like Donkey Kong, Pong, et cetera. While it’s had a lot of success in games, it’s had less success elsewhere. So it’s more novel applying these techniques to more practical engineering problems related to, for example, decarbonizing energy use and energy trading.

So this is really where I went with my PhD research. I was looking at using this as a control technique for heating and cooling systems in buildings. And now, in my current work at GridBeyond, I’m looking at how we can also use these techniques for energy trading.

So a building is much more complex in terms of the dynamics of modelling the building. Modelling a battery is somewhat easier, but there is still the environment it operates in, the markets that the battery can trade in, and this is where the complexity rises. Reinforcement learning is very good at building a black box. So we don’t have to worry too much about the structure of the problem, which is the problem with using more classical control techniques.

Whereas with reinforcement learning, we’re essentially using deep learning. So essentially several layers of neural networks to try to model the problem. And this allows us not to worry too much about how we mathematically model the problem.

Adam Sroka: So I dabbled in reinforcement learning towards the end of my doctorate, which was also for a large engineering firm, but working on laser weapon systems.

So it took me a little later in my career than you to realise I was on the wrong path. I always found that, yeah, exactly like you’ve said, it’s really well suited to things like games that are already nicely, strongly codified by definition, but a lot of the finesse in applying these techniques is being able to actually codify the constraints and the reward function appropriately for the scenario.

Am I right in thinking that’s probably a large focus of the work, and of getting the most out of these tools in the way you’re doing?

Anjukan Kathirgamanathan: Yeah, exactly. Modelling the reward function. So exactly what does the agent do, or what is the best outcome for the agent? That’s not a trivial task, and a further challenge is modelling a simulation environment where we can train these agents without compromising a real battery or a real customer’s trading performance.

It’s really important that we have a test bed where we can train these agents first before deploying them out to real assets and real customers. So these are some of the big challenges with reinforcement learning because it’s very training dependent. It’s only as good as its training performance.

It can be quite sample inefficient. So it needs a lot of experience, or a lot of samples of data, before it finally achieves a policy that you actually want the agent to follow. So these are some of the challenges in the real world that we face with these agents. And often you’re pressed with deadlines if you’re in this industry, as there’s so much going on: so many new markets, so many new assets, so many new customers. Commercial pressures often mean that there are simpler solutions that achieve 95 percent of the performance that any reinforcement learning agent can achieve.

And so often that’s the easy way to go, or that’s the fast way to go. And more advanced techniques like reinforcement learning get pushed to the side, or stay more in research and development, and don’t actually get commercialised or productionized.
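
Editor’s sketch: to make the reward function and simulation environment described in this answer a little more concrete, here is a minimal toy example in Python. It is not GridBeyond’s method. The price list, the two-unit battery, the `step` function and the tabular Q-learning settings are all assumptions chosen for illustration; real trading environments involve market rules, forecasts and asset constraints that are not modelled here.

```python
import random

# Hypothetical hourly prices for one short "day" (illustrative, not market data).
PRICES = [20, 15, 10, 12, 30, 45, 60, 40]
ACTIONS = (-1, 0, 1)  # discharge one unit, idle, charge one unit
CAPACITY = 2          # assumed battery size: holds 0..2 units of energy


def step(hour, soc, action):
    """Simulated environment: returns (next_hour, next_soc, reward)."""
    new_soc = min(max(soc + action, 0), CAPACITY)  # clamp to physical limits
    energy = new_soc - soc                         # > 0 bought, < 0 sold
    reward = -energy * PRICES[hour]                # pay to charge, earn to discharge
    return hour + 1, new_soc, reward


def train(episodes=5000, alpha=0.1, gamma=1.0, epsilon=0.2, seed=0):
    """Tabular Q-learning over (hour, state of charge) states."""
    rng = random.Random(seed)
    q = {(h, s): [0.0] * len(ACTIONS)
         for h in range(len(PRICES) + 1) for s in range(CAPACITY + 1)}
    for _ in range(episodes):
        hour, soc = 0, 0
        while hour < len(PRICES):
            if rng.random() < epsilon:             # explore a random action
                a = rng.randrange(len(ACTIONS))
            else:                                  # exploit the current estimate
                a = max(range(len(ACTIONS)), key=lambda i: q[(hour, soc)][i])
            next_hour, next_soc, r = step(hour, soc, ACTIONS[a])
            target = r + gamma * max(q[(next_hour, next_soc)])
            q[(hour, soc)][a] += alpha * (target - q[(hour, soc)][a])
            hour, soc = next_hour, next_soc
    return q


def greedy_profit(q):
    """Run the learned policy once without exploration and total the reward."""
    hour, soc, total = 0, 0, 0.0
    while hour < len(PRICES):
        a = max(range(len(ACTIONS)), key=lambda i: q[(hour, soc)][i])
        hour, soc, r = step(hour, soc, ACTIONS[a])
        total += r
    return total


if __name__ == "__main__":
    print("profit of learned policy:", greedy_profit(train()))
```

Even in this tiny setting the agent needs thousands of simulated days to learn a schedule that a simple optimiser would find directly, which is one way to see the sample-inefficiency and commercial-pressure points made above.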

Adam Sroka: So that touches on a really interesting point and a pain that I went through.

When I was trying to deploy RL stuff for batteries, I was trying to build some fairly light, entry-level reinforcement learners to do exactly what you’re doing with optimising batteries. And I found that actually the end users, the customers that owned and operated the batteries, almost just didn’t want it. They were much more comfortable with rules-based systems, and even though these were horrific, like lists of thousands of rules, they were at least able to understand them at some point. How do you bring people on the journey with you to trusting that reinforcement learner as a solution?

Anjukan Kathirgamanathan: Yeah, that’s a completely valid point.

And we face this as well. And often it’s about building trust with the customer, showing live performance of our assets first, even if that’s using simpler, mathematical optimization based solutions. It’s very important that the customer also sees how we are as traders. So we have our own 24/7, 365 trading desk, and they’re very much in the loop as part of the control of these assets.

So it’s about reinforcing that, look, it’s not fully automated. We have traders; essentially, we call it human in the loop. And that gives them the confidence that when things do go wrong, or an asset needs to be taken offline, or there’s a sudden market event, the trader is able to quickly step in and respond.

So essentially what we’re selling is a product where the human, or the trader with their vast experience, is supervising the trading. And if something goes wrong, they can quickly take over, they can take control, and it’s never 100 percent automated. Even though that might be where we need to go, considering how fast the markets are and how dynamic things are at the moment, it’s still very much human in the loop and trader overseen, I guess they would say.

Adam Sroka: Okay. Yeah. No, that’s really interesting. And I think it is the right solution. Like, a lot of people still want those experts there, rightly in my belief, because reinforcement learning is not going to be able to respond to something that has never happened before or isn’t within a dataset.

I guess, how involved are you in the plumbing then? Does the quality of your data sources and your data pipelines become a real dominant factor in the performance of these things? And like, how do the team at GridBeyond deal with that?

Anjukan Kathirgamanathan: Very much so, like it really is a case of garbage in, garbage out, right?

So we ensure data quality first, ensuring that we have high availability of data. I’m very lucky that I work with some expert data engineers who make sure that we have our own data warehouses with all the data we need and that we have easy access to it. So we go to third party data sources and bring them into our own data warehouses to ensure that we have that historical data. But of course, when you’re scraping data, real-time feeds do go down or APIs change.

And these are some of the challenges we deal with on a day-to-day basis, and they keep me busy a lot of the time. And then the simulation environments, that’s a big issue as well. Regulations change, markets change. And for some of the more complex markets, it’s very hard to build simulation engines that can adequately capture all of the market settlement, you know, all the rules that determine whether your trade gets accepted or not.

So we do have to make simplifying assumptions here. We have to start simple, but of course, in a real-world setting, you will notice that performance deviates from what you might have seen in training, just because the market operator might not have accepted a certain trade for one reason or another, whether that’s congestion or local issues on the grid.

So these are some of the challenges that we deal with, and training data is often very different from what we see in deployment.

Adam Sroka: Okay. So I’d be interested to hear: if a young battery owner startup comes along, and they’ve had loads of investment, they’ve bought a big fancy battery and they want to build a data team, and they came to you and said, that sounds very interesting, we want to build our own reinforcement learning optimizer to compete with yours, do you have any words of advice or wisdom that you’d give them?

Anjukan Kathirgamanathan: I think the old adage of start simple is a golden one. When building up an agent, start with a simple toy example. So this is also what I did in my PhD.

There are a lot of good examples of open source tools providing these environments; OpenAI Gym is one. This lets you get familiar with these control approaches in a toy example. So it’s a very simple objective, whereas in the real world, with trading, the objective is often much more complex. But you get a feel for how these agents are performing.

And then I’d also say there’s no point in reinventing the wheel. And many of the straight out of the box algorithms are a very good place to start rather than trying to code an agent from scratch. I mean, if you’re really interested in the mathematical side, it’s very fascinating. And there’s a lot of research going on.

It’s a real deep rabbit hole, I would say, but again, the commercial or open source agents that are out there are very good places to start now. So I would say, yeah, start simple, use out-of-the-box algorithms. And then again, looking at the markets, it’s a real question of, and this is what we do at GridBeyond, finding the markets with the most value that are very liquid, finding the best value for your asset.

And then these simple markets are easier to model as well. So for example, if it has a simple clearing mechanism, a simple pay-as-clear market, it helps build the simulation environment because there are very simple mathematical rules.

And again, as I said, there are certain markets which are much more complex, or even a little bit black box, in terms of modelling, you know, whether a trade gets accepted or not. So again, maybe not the ones to start with. So starting simple is what I’m really saying.

Adam Sroka: What do you think the future of reinforcement learning is in the energy sector?

Because it naturally feels like it’s such a complicated space. It’s so wild and varied and technical and hard that it’s actually a lovely problem for reinforcement learning to come and solve. Right. Do you foresee it ever becoming like the dominant player, like the dominant approach, or will it be constrained to like the big brain sort of PhDs like yourself?

Anjukan Kathirgamanathan: Not sure. Good question. I think it’s really dependent also on how customers or the operators of these assets feel about it, because at some point they’re trusting us to operate the asset, and in the case of utility-scale batteries, that’s in the millions of euros or pounds. So I would say it’s very much dependent on how the operators feel.

And for them always the health or state of health of the assets, the performance of the asset is very important as well. So if we have proven track records of these techniques, I’d say optimising and managing these assets in a safe and profitable way, then I think that there’s a good commercial driver.

And I’d say we’re maybe a little while away still from having that sort of proven track record of using these techniques. But there are assets where I think reinforcement learning has a lot of potential, and that’s especially the smaller distributed resources. So when we’re talking about electric vehicles, heat pumps, you know, if we try to build classical control techniques for all of these assets, which are going to be part of the smart grid in the future, that’s just too cumbersome.

You’re going to need too many engineers, too many mathematical models, with changing market dynamics and model dynamics. I don’t see that happening. Whereas with reinforcement learning, there’s a lot of research in multi-agent techniques, whether that’s coordination, so all of these agents work together, or competitive approaches.

So it’s very much in the research space, I would say, from what I’ve seen. But when we’re talking about thousands of assets at scale, this is a very good potential application of these multi-agent reinforcement learning solutions.

Adam Sroka: So, interested to hear your thoughts on the talent piece. It feels like it’s a weird market right now and lots of people are looking for data roles.

It’s become quite interesting again, coupled with big layoffs, so there’s lots of very talented people that are available. Also energy is becoming quite attractive to people, and we’re attracting better and better people into the industry. Are you finding that the people that you’re speaking to and hiring have the right foundations to understand and work with reinforcement learning type models and these skills? Or is it something that’s maybe missing from university curricula or other experiences?

Anjukan Kathirgamanathan: I would say we do face a crunch in hiring in this space. And the biggest constraint is not necessarily understanding or experience with techniques like reinforcement learning and AI, but it’s understanding of the energy markets and more of the engineering side of things.

And so this is an area where we really struggle to hire, and usually if someone comes to us with a background or experience in these techniques, they still need a good few months to get familiar with the energy market, all the acronyms, the different products, the engineering behind it, you know, how a power system operates, the technical constraints.

So this is one of the biggest challenges we face, but I am seeing an increasing number of people even change careers based on wanting to work in the renewables sector, based on wanting to make a societal difference. So combined with the increasing awareness of AI techniques like reinforcement learning, and more people doing master’s degrees in data science and AI, I think we’re going to be in a much better place in a few years’ time.

Adam Sroka: Totally. It’s almost like an advert for Hypercube. That’s exactly the reason we started the business. Like, so many people know the technology and the tools, and it’s great, and that’s getting more and more attractive, but actually for the energy sector, especially trading and batteries, it’s so complicated that it does take months to really onboard and learn all that stuff.

So yeah, for anyone listening that’s experiencing similar problems, you know where to find us. Last thing I wanted to ask: we’ve been talking a lot about reinforcement learning, but are there any other technologies that really get you excited for the next few years, or that you think are going to take off in a big way? Like, what are you learning next? What’s interesting to you in the tech space?

Anjukan Kathirgamanathan: I think it is multi-agent reinforcement learning, because increasingly we’re starting to look at smaller and smaller assets, where we used to focus on big factories, big demand side load centres or big utility-scale batteries.

Increasingly, the value is going to come from aggregating much smaller distributed energy resources, whether that’s smaller batteries, PVs, electric vehicles, heat pumps. And with all of these, essentially, agents, we can say, participating in more and more time-varying prices or even getting access to wholesale markets, you really have to think of these multi-agent scenarios, where these assets either work together, coordinate, or compete.

And as I was saying before, this is still at a very early stage. It’s in academia, but it hasn’t really come through to the commercial world. So that’s one area I would say to look out for, and I’m struggling sometimes just to keep abreast of all of the new papers in this field, but it’s a very interesting one.

Adam Sroka: Is there anything that you think’s overhyped, or any hot takes or controversial outlooks on the data and AI industry as a whole?

Anjukan Kathirgamanathan: I would say LLMs and things like ChatGPT. It’s very interesting, but in the energy space there are probably limited applications at the moment. Our core focus is on rolling out; speed is important in this decarbonization project we’re on. And most of the time, simple machine learning techniques, neural networks, gradient boosting, you know, these are high performers in terms of forecasting. In terms of the optimising, just using simple mathematical solvers, linear programs or nonlinear programs, they get the job done, and speed is important.

So this is an area where I struggle to keep abreast of the rest of AI, you could say. I very much focus on what I need to get the job done, and that already keeps me very busy.

Adam Sroka: That might be my new best friend. No, I feel similarly. I think I’m probably holding out for other people to make a lot of mistakes that I can learn from before I dive in too.

Okay. Last thing I always try and ask everyone: if people want to learn more about you and your team, or if there are things you want to plug, where should we point them to?

Anjukan Kathirgamanathan: I am on LinkedIn and I’m happy to have a chat. We at GridBeyond are also on LinkedIn. You can find us at www.gridbeyond.com.

If you’re interested in anything energy trading, or if you own assets or any demand side, if you’re part of any I&C sort of electric load, then have a chat with us. We’re probably in an energy market near you. And we can see how we can get the best monetary value for your asset or the flexibility that you can offer to the grid.

And it’s a win-win: extra revenue for you, and for the electric grid it helps support the rollout of decarbonized renewable energy sources. So come and have a chat. And also on the data side, if you’re interested in data science, I’d love to chat as well.

Adam Sroka: Amazing. Well, look, it’s been a pleasure to speak to you.

Thank you very much for joining me. And yeah, look forward to catching up again in the future.

Anjukan Kathirgamanathan: Yeah, we’ll have to talk more about your reinforcement learning experience separately as well.

Adam Sroka: Thank you. 

Outro: And that’s it for this episode of the Hypercube podcast. Thanks for tuning in today.

If you have any questions about the topics we covered, you can reach out to us on LinkedIn or check out our website. You can also join Beyond Energy, our Slack community of data leaders from the sector. There’s a link to sign up in the episode description. We’re just getting this show off the ground, so if you like today’s episode, please leave us a rating, review, or subscribe wherever you get your podcasts.

It all really helps. See you next time.