
Understanding ATO Systems in Railway Control
Learn about ATO systems in railway control, including how they work, their tasks, and considerations during development. Explore a motivational task for optimizing energy consumption in train operations.
Presentation Transcript
Optimal control systems for rolling stock. Author: Jiří Nábělek, Skoda Group.
What is an ATO system? ATO = Automatic Train Operation, a train/tram control system. Control is exercised by specifying the relative thrust (i.e. force) or acceleration. Basic ATO tasks: 1. Do not exceed the permitted speed on any section of track. 2. Observe the timetable. 3. Minimise the energy consumed for the journey. Important issues to keep in mind during development: speed of calculation and inaccuracies in position/velocity measurements.
Motivational task, or does the ATO make sense? Assignment: Let's have a track on flat terrain, 5.4 km long, between two stops. The journey time is scheduled in the timetable as 270 s. The speed limit on the entire track is 100 km/h. Target: Calculate the energy consumed by the train for an optimal and a more realistic driving scenario. Compare the difference and determine the percentage energy savings.
Motivational task - optimal solution. Accelerate to 80 km/h and maintain a speed of 80 km/h. Then travel 3 km with zero acceleration (drag forces will slow the train down). Towards the end of the track, start braking with maximum force to stop exactly at the end of the track.
Motivational task - realistic scenario. Accelerate to 70 km/h instead of 80. Later, when we find we are behind schedule, we accelerate to 82 km/h. We then travel 3.5 km with zero acceleration (drag forces will slow the train down). At the end we brake to a stop; travel time: 270 s (on schedule).
Motivational task - summary. Same journey time. Comparison of consumption: optimal solution 60.7 MJ, realistic scenario 64.6 MJ. Approximately 6.25% more mechanical energy was used than in the optimal driving scenario.
Motivational task - conclusion. To save energy, the train must accelerate to exactly the target speed; this is the most important factor. Missing the target speed of 80 km/h by 2 km/h results in an energy loss of at least 5% just due to the difference in kinetic energy. The electricity costs for operating the Prague metro run into billions annually (before the war in Ukraine). Under optimal control, the estimated savings are 62.5 million per year.
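The roughly 5% figure follows directly from the quadratic dependence of kinetic energy on speed, so the train's mass cancels out of the ratio; a minimal check, with the speeds taken from the slides above:

```python
# Quick check of the ~5% kinetic-energy figure quoted above: having to cruise
# at 82 km/h instead of the target 80 km/h to stay on schedule costs roughly
# (82^2 - 80^2) / 80^2 in extra kinetic energy, independent of the train's mass.
v_target = 80.0   # km/h, intended cruising speed
v_catchup = 82.0  # km/h, speed needed after the initial undershoot

extra = (v_catchup**2 - v_target**2) / v_target**2
print(f"extra kinetic energy: {extra:.1%}")   # -> 5.1%
```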
Why reinforcement learning? We do not assume obstacles in a train's path, so we can plan ATO actions in advance: deterministic optimization, calculus of variations, dynamic programming. Random obstacles often appear in a tram's path, so pre-planned actions may not be optimal in the current situation. Reinforcement learning is designed to make dynamic decisions about actions based on the state and can work in a stochastic environment. Reinforcement learning also intuitively better fits the idea of human control in the city.
Environment model. Track parameters (environment): $L$ length of the track, $T$ time to complete the track, $N$ number of segments, $\Delta s$ segment length (discretization). Update equations characterizing the next state of the tram (constant acceleration $a_k$ over a segment): $v_{k+1} = \sqrt{v_k^2 + 2 a_k \Delta s}$, $t_{k+1} = t_k + \frac{2\Delta s}{v_k + v_{k+1}}$.
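A minimal sketch of this segment-wise update in Python, assuming the reconstruction above (constant acceleration across each segment of length Δs); the function name and units are illustrative only:

```python
import math

def step(v_k: float, t_k: float, a_k: float, ds: float) -> tuple[float, float]:
    """One segment of the tram model: constant acceleration a_k over length ds.
    Implements v_{k+1} = sqrt(v_k^2 + 2*a_k*ds) and
               t_{k+1} = t_k + 2*ds / (v_k + v_{k+1}).
    Units: ds in m, v in m/s, t in s, a in m/s^2.  Assumes the tram does not
    come to a full stop inside the segment (v_k + v_{k+1} > 0)."""
    v_next_sq = v_k ** 2 + 2.0 * a_k * ds
    if v_next_sq < 0.0:
        raise ValueError("braking harder than the segment allows")
    v_next = math.sqrt(v_next_sq)
    t_next = t_k + 2.0 * ds / (v_k + v_next)
    return v_next, t_next
```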
Agent model. Tram parameters (agent): $m$ weight, $a_{\max}$ maximum acceleration. Action space: a normalized action $u_k \in (0;1)$ mapped onto the feasible interval $(a_{\min,k};\, a_{\max})$. The minimum acceleration $a_{\min,k}$ follows from the update equations: $a_{\min,k} = \max\{-a_{\max},\ -\frac{v_k^2}{2\Delta s}\}$.
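Read this way (a normalized action scaled onto the feasible interval, which is one plausible interpretation of the original notation), the action handling can be sketched as follows; the lower bound is exactly the hardest braking that keeps $v_{k+1}^2$ non-negative in the update above:

```python
def feasible_acceleration(u_k: float, v_k: float, ds: float, a_max: float) -> float:
    """Map a normalized action u_k in (0, 1) onto (a_min_k, a_max).
    a_min_k = max(-a_max, -v_k**2 / (2*ds)) is the strongest braking that still
    keeps v_{k+1}^2 = v_k^2 + 2*a*ds non-negative within the segment."""
    a_min_k = max(-a_max, -v_k ** 2 / (2.0 * ds))
    return a_min_k + u_k * (a_max - a_min_k)
```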
Subtask 1. Assignment: Let's have a track without any obstacles on flat terrain, 200 m long, between two stops. The travel time is scheduled in the timetable as 25 s. The speed limit on the whole track is 40 km/h, except for the last 10 m, where the speed limit is 20 km/h. Target: Use reinforcement learning to find a driving strategy.
Reinforcement learning - subtask 1. Test formulation: Environment states: $s = (x_k, v_k, t_k)$, for $k = 0,\dots,N$. Achievable tram acceleration: $a_k \in (-2;\,2)$ m/s². Track parameters: $(L, T, N, \Delta s)$. Reward function: 1) Maintain maximum speed: $r_{vel} = -c_{vel}\,\max\{0,\ v_k - v_{\max}\}$. 2) Keeping time: $r_t = -c_t\,|t_N - T|$. 3) Saving energy: $r_E = -c_E\,|m\,a_k\,\Delta s|$. Total: $R = r_{vel} + r_t + r_E$.
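A sketch of this reward in Python, following the reconstruction above; the weighting coefficients c_vel, c_t, c_E and the exact point at which the timing term is applied (per step or only at the terminal state) are assumptions, since the slide only shows the structure of the penalties:

```python
def reward(v_k, a_k, t_arrival, v_limit, T, m, ds,
           c_vel=1.0, c_t=1.0, c_E=1.0):
    """Three-part reward for subtask 1 (all terms are penalties, i.e. <= 0)."""
    r_vel = -c_vel * max(0.0, v_k - v_limit)   # exceeding the speed limit
    r_t   = -c_t   * abs(t_arrival - T)        # deviation from the timetable
    r_E   = -c_E   * abs(m * a_k * ds)         # mechanical work in the segment
    return r_vel + r_t + r_E
```

With the subtask 1 parameters (L = 200 m, Δs = 10 m, T = 25 s, speed limit 40 km/h ≈ 11.1 m/s), one such call per 10 m segment accumulates into the episode return.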
Output of subtask 1. Test parameters: $L = 200$ m, $T = 25$ s, speed limit 40 km/h, $\Delta s = 10$ m. We can notice that the system approximately follows the optimal strategy. The tram did not always use the maximum available acceleration in this experiment. Possible reasons: 1. Non-optimal scaling of the reward function. 2. Coarse discretization.
Comparison with deterministic solution - stochastic vs. deterministic.
Subtask 2. Assignment: Consider a track on level ground, 200 m long, between two stops. There is a crossing in the middle of the track where an obstacle appears at random times (the position of the crossing itself is fixed on the track). The travel time is scheduled in the timetable as 30 s. The speed limit on the whole track is 40 km/h, except for the last 10 m, where the speed limit is 20 km/h. Target: Use reinforcement learning to find a driving strategy that gets the control system across the crossing safely.
Reinforcement learning - subtask 2. Test formulation: Environment states: the subtask 1 state $(x_k, v_k, t_k)$ extended with two obstacle-related components, for $k = 0,\dots,N$. Achievable tram acceleration: $a_k \in (-2;\,2)$ m/s². Track parameters: $(L, T, N, \Delta s)$ plus the fixed position of the crossing. Reward function: 1) Safety: $r_{safe}$, a penalty for passing the crossing while it is occupied. 2) Maintain maximum speed: $r_{vel} = -c_{vel}\,\max\{0,\ v_k - v_{\max}\}$. 3) Keeping time: $r_t = -c_t\,|t_N - T|$. 4) Saving energy: $r_E = -c_E\,|m\,a_k\,\Delta s|$. Total: $R = r_{safe} + r_{vel} + r_t + r_E$.
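The exact form of the safety term is not preserved on the slide, so the sketch below simply models it as a fixed collision penalty added on top of the subtask 1 terms; both the penalty value and the occupancy test are illustrative assumptions:

```python
def reward_subtask2(in_crossing: bool, crossing_occupied: bool,
                    r_vel: float, r_t: float, r_E: float,
                    collision_penalty: float = 1000.0) -> float:
    """R = r_safe + r_vel + r_t + r_E, with r_safe modelled (as an assumption)
    as a large fixed penalty whenever the tram is in the crossing while the
    obstacle is present."""
    r_safe = -collision_penalty if (in_crossing and crossing_occupied) else 0.0
    return r_safe + r_vel + r_t + r_E
```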
Output of subtask 2. The obstacle in this task was randomly generated on the track at times between 8 and 12 s. The tram received the information about the obstacle at a distance of 60 m from the beginning of the track. The tram slowed down as needed, passed through the location once the obstacle had disappeared, and then accelerated again to maximum speed to keep time.
Deterministic vs. stochastic approach. Deterministic approach: + fast computation time; - non-optimal in a stochastic environment. Stochastic approach: + optimal in a stochastic environment; - long computation time. Both approaches have their advantages and disadvantages. Neither approach on its own is applicable in an ATO system, i.e. usable in real control.
Future goals - policy iteration. By combining both approaches we can achieve better results. The key is to use the already known deterministic solution to find the optimal policy. The deterministic solution satisfies 2 of the 3 goals of the ATO system; if no obstacle appears, it is even the optimal solution.
Future goals - policy iteration. We will use the policy iteration algorithm and take the deterministic solution as the initial policy. The vision is that within a few iterations we reach the optimal policy and reduce the computation time.
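To make the idea concrete, below is a generic, textbook policy-iteration skeleton, not the Skoda implementation. It assumes a discretized action set, transition(s, a) returning a list of (probability, next_state) pairs (which is where the random obstacle enters), and initial_policy being the precomputed deterministic driving profile:

```python
def policy_iteration(states, actions, transition, reward, initial_policy,
                     gamma=0.99, tol=1e-6):
    """Policy iteration for a finite MDP, started from initial_policy."""
    def q_value(s, a, V):
        # Expected one-step return: immediate reward plus discounted value
        # of the successor states, weighted by their probabilities.
        return reward(s, a) + gamma * sum(p * V.get(s2, 0.0)
                                          for p, s2 in transition(s, a))

    policy = dict(initial_policy)
    V = {s: 0.0 for s in states}
    while True:
        # Policy evaluation: sweep the Bellman expectation backup to convergence.
        while True:
            delta = 0.0
            for s in states:
                v_new = q_value(s, policy[s], V)
                delta = max(delta, abs(v_new - V[s]))
                V[s] = v_new
            if delta < tol:
                break
        # Policy improvement: act greedily with respect to the current V.
        stable = True
        for s in states:
            best = max(actions, key=lambda a: q_value(s, a, V))
            if best != policy[s]:
                policy[s], stable = best, False
        if stable:
            return policy, V
```

If the deterministic profile is already near-optimal except around the random crossing, only the states affected by the obstacle should change during the improvement sweeps, which is the hoped-for source of the speed-up.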
Future goals - general. Extend the single-point obstacle to an interval in which obstacle points are generated. Extend the model to include drag forces, i.e. air resistance and gravity on different gradients of the track. Implement policy iteration to improve computing time. In case of success, there is the possibility of testing the optimization on the route of Prague tram line 7.
Conclusion. Optimising the energy consumption of an ATO system makes sense. When driving a tram, it is not possible to optimally pre-calculate the route plan as it is when driving a train. A set of rules for different system states makes more sense for tram control than rules based on location alone. The use of reinforcement learning in an ATO system for trams leads to better decision making. Thank you for your attention.