行业新闻

基于深度强化学习的组合优化研究进展

[1] Papadimitriou C H, Steiglitz K. Combinatorial Optimization: Algorithms and Complexity. Mineola, New York: Dover Publications, 1998.
[2] Festa P. A brief introduction to exact, approximation, and heuristic algorithms for solving hard combinatorial optimization problems. In: Proceedings of the 16th International Conference on Transparent Optical Networks (ICTON). Graz, Austria: IEEE, 2014. 1−20
[3] Lawler E L, Wood D E. Branch-and-bound methods: A survey. Operations Research, 1966, 14(4): 699-719 doi: 10.1287/opre.14.4.699
[4] Bertsekas D P. Dynamic Programming and Optimal Control. Belmont, MA: Athena Scientific, 1995.
[5] Sniedovich M. Dynamic Programming: Foundations and Principles (Second edition). Boca Raton, FL: CRC Press, 2010.
[6] Williamson D P, Shmoys D B. The Design of Approximation Algorithms. Cambridge: Cambridge University Press, 2011.
[7] Vazirani V V. Approximation Algorithms. Berlin, Heidelberg: Springer, 2003.
[8] Hochba D S. Approximation algorithms for NP-hard problems. ACM Sigact News, 1997, 28(2): 40-52 doi: 10.1145/261342.571216
[9] Teoh E J, Tang H J, Tan K C. A columnar competitive model with simulated annealing for solving combinatorial optimization problems. In: Proceedings of the 2006 IEEE International Joint Conference on Neural Network. Vancouver, BC, Canada: IEEE, 2006. 3254?3259
[10] Van Laarhoven P J M, Aarts E H L, Lenstra J K. Job shop scheduling by simulated annealing. Operations Research, 1992, 40(1): 113-125 doi: 10.1287/opre.40.1.113
[11] Wesley Barnes J, Laguna M. Solving the multiple-machine weighted flow time problem using tabu search. IIE Transactions, 1993, 25(2): 121-128 doi: 10.1080/07408179308964284
[12] Basu S. Tabu search implementation on traveling salesman problem and its variations: A literature survey. American Journal of Operations Research, 2012, 2(2): 163-173 doi: 10.4236/ajor.2012.22019
[13] Halim A H, Ismail I. Combinatorial optimization: Comparison of heuristic algorithms in travelling salesman problem. Archives of Computational Methods in Engineering, 2019, 26(2): 367-380 doi: 10.1007/s11831-017-9247-y
[14] Rezoug A, Bader-El-Den M, Boughaci D. Guided genetic algorithm for the multidimensional knapsack problem. Memetic Computing, 2018, 10(1): 29-42 doi: 10.1007/s12293-017-0232-7
[15] Lin B, Sun X Y, Salous S. Solving travelling salesman problem with an improved hybrid genetic algorithm. Journal of Computer and Communications, 2016, 4(15): 98-106 doi: 10.4236/jcc.2016.415009
[16] Prado R S, Silva R C P, Guimaraes F G, Neto O M. Using differential evolution for combinatorial optimization: A general approach. In: Proceedings of the 2010 IEEE International Conference on Systems, Man and Cybernetics. Istanbul, Turkey: IEEE, 2010. 11?18
[17] Onwubolu G C, Davendra D. Differential Evolution: A Handbook for Global Permutation-Based Combinatorial Optimization. Vol. 175. Berlin, Heidelberg: Springer, 2009.
[18] Deng W, Xu J J, Zhao H M. An improved ant colony optimization algorithm based on hybrid strategies for scheduling problem. IEEE Access, 2019, 7: 20281-20292 doi: 10.1109/ACCESS.2019.2897580
[19] Ramadhani T, Hertono G F, Handari B D. An Ant Colony Optimization algorithm for solving the fixed destination multi-depot multiple traveling salesman problem with non-random parameters. AIP Conference Proceedings, 2017, 1862: 030123
[20] Zhong Y W, Lin J, Wang L J, Zhang H. Discrete comprehensive learning particle swarm optimization algorithm with Metropolis acceptance criterion for traveling salesman problem. Swarm and Evolutionary Computation, 2018, 42: 77-88 doi: 10.1016/j.swevo.2018.02.017
[21] Nouiri M, Bekrar A, Jemai A, Niar S, Ammari A C. An effective and distributed particle swarm optimization algorithm for flexible job-shop scheduling problem. Journal of Intelligent Manufacturing, 2018, 29(3): 603-615 doi: 10.1007/s10845-015-1039-3
[22] Lourenco H R, Martin O C, Stutzle T. Iterated local search: Framework and applications. Handbook of Metaheuristics. Springer, 2019. 129?168
[23] Grasas A, Juan A A, Lourenco H R. SimILS: A simulation-based extension of the iterated local search metaheuristic for stochastic combinatorial optimization. Journal of Simulation, 2016, 10(1): 69-77 doi: 10.1057/jos.2014.25
[24] Zhang G H, Zhang L J, Song X H, Wang Y C, Zhou C. A variable neighborhood search based genetic algorithm for flexible job shop scheduling problem. Cluster Computing, 2019, 22(5): 11561-11572
[25] Hore S, Chatterjee A, Dewanji A. Improving variable neighborhood search to solve the traveling salesman problem. Applied Soft Computing, 2018, 68: 83-91 doi: 10.1016/j.asoc.2018.03.048
[26] Silver D, Schrittwieser J, Simonyan K, Antonoglou I, Huang A, Guez A, et al. Mastering the game of go without human knowledge. Nature, 2017, 550(7676): 354-359 doi: 10.1038/nature24270
[27] Mnih V, Kavukcuoglu K, Silver D, Rusu A A, Veness J, Bellemare M G, et al. Human-level control through deep reinforcement learning. Nature, 2015, 518(7540): 529-533 doi: 10.1038/nature14236
[28] Hopfield J J, Tank D W. ``Neural'' computation of decisions in optimization problems. Biological Cybernetics, 1985, 52(3): 141-152
[29] Smith K A. Neural networks for combinatorial optimization: A review of more than a decade of research. INFORMS Journal on Computing, 1999, 11(1): 15-34 doi: 10.1287/ijoc.11.1.15
[30] Vinyals O, Fortunato M, Jaitly N. Pointer networks. In: Proceedings of the 28th International Conference on Neural Information Processing Systems. Montreal, Canada: MIT Press, 2015. 2692?2700
[31] Bello I, Pham H, Le Q V, Norouzi M, Bengio S. Neural combinatorial optimization with reinforcement learning. In: Proceedings of the 5th International Conference on Learning Representations (ICLR 2017). Toulon, France, 2017.
[32] Nazari M, Oroojlooy A, TakacM, Snyder L V. Reinforcement learning for solving the vehicle routing problem. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems. Montreal, Canada: Curran Associates Inc., 2018. 9861?9871
[33] Deudon M, Cournut P, Lacoste A, Adulyasak Y, Rousseau L M. Learning heuristics for the TSP by policy gradient. In: Proceedings of the 15th International Conference on the Integration of Constraint Programming, Artificial Intelligence, and Operations Research. Delft, The Netherlands: Springer, 2018. 170?181
[34] Kool W, Van Hoof H, Welling M. Attention, learn to solve routing problems! In: Proceedings of the 7th International Conference on Learning Representations (ICLR2019). New Orleans, LA, USA, 2019.
[35] Ma Q, Ge S W, He D Y, Thaker D, Drori I. Combinatorial optimization by graph pointer networks and hierarchical reinforcement learning. In: Proceedings of the 1st International Workshop on Deep Learning on Graphs: Methodologies and Applications. New York, NY, USA: AAAI, 2020.
[36] Li K W, Zhang T, Wang R. Deep reinforcement learning for multiobjective optimization. IEEE Transactions on Cybernetics, 2021, 51(6): 3103-3114 doi: 10.1109/TCYB.2020.2977661
[37] Dai H J, Khalil E B, Zhang Y Y, Dilkina B, Song L. Learning combinatorial optimization algorithms over graphs. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach, California, USA: Curran Associates Inc., 2017. 6351?6361
[38] Manchanda S, Mittal A, Dhawan A, Medya S, Ranu S, Singh A. Learning heuristics over large graphs via deep reinforcement learning. arXiv preprint arXiv: 1903.03332, 2020
[39] Li Z W, Chen Q F, Koltun V. Combinatorial optimization with graph convolutional networks and guided tree search. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems. Montreal, Canada: Curran Associates Inc., 2018. 537?546
[40] Nowak A, Villar S, Bandeira A S, Bruna J. A note on learning algorithms for quadratic assignment with graph neural networks. In: Proceeding of the 34th International Conference on Machine Learning (ICML). Sydney, Australia, 2017.
[41] Joshi C K, Laurent T, Bresson X. An efficient graph convolutional network technique for the travelling salesman problem. arXiv preprint arXiv: 1906.01227, 2019
[42] Helsgaun K. An Extension of the Lin-Kernighan-Helsgaun TSP Solver for Constrained Traveling Salesman and Vehicle Routing Problems, Technical Report, Roskilde University, Denmark, 2017.
[43] Perron L, Furnon V. Google's OR-Tools [Online], available: https://developersgooglecom/optimization, 2019
[44] OPTIMIZATION G. INC. Gurobi optimizer reference manual, 2015 [Online], available: http://wwwgurobicom, 2014
[45] Applegate D, Bixby R, Chvatal V, Cook W. Concorde TSP solver. 2006
[46] Bengio Y, Lodi A, Prouvost A. Machine learning for combinatorial optimization: A methodological tour d'Horizon. arXiv preprint arXiv: 1811.06128, 2020
[47] Chen X Y, Tian Y D. Learning to perform local rewriting for combinatorial optimization. In: Proceedings of the 33rd Conference on Neural Information Processing Systems. Vancouver, Canada: Curran Associates, Inc., 2019. 6278?6289
[48] Yolcu E, Poczos B. Learning local search heuristics for boolean satisfiability. In: Proceedings of the 33rd Conference on Neural Information Processing Systems. Vancouver, Canada: Curran Associates, Inc., 2019. 7992?8003
[49] Gao L, Chen M X, Chen Q C, Luo G Z, Zhu N Y, Liu Z X. Learn to design the heuristics for vehicle routing problem. arXiv preprint arXiv: 2002.08539, 2020
[50] Lu H, Zhang X W, Yang S. A learning-based iterative method for solving vehicle routing problems. In: Proceedings of the 8th International Conference on Learning Representations. Addis Ababa, Ethiopia, 2020.
[51] Scarselli F, Gori M, Tsoi A C, Hagenbuchner M, Monfardini G. The graph neural network model. IEEE Transactions on Neural Networks, 2009, 20(1): 61-80 doi: 10.1109/TNN.2008.2005605
[52] Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez A N, et al. Attention is all you need. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach, California, USA: Curran Associates Inc., 2017. 6000?6010
[53] Hsu C H, Chang S H, Liang J H, Chou H P, Liu C H, Chang S C, et al. MONAS: Multi-objective neural architecture search using reinforcement learning. arXiv preprint arXiv: 1806.10332, 2018
[54] Mossalam H, Assael Y M, Roijers D M, Whiteson S. Multi-objective deep reinforcement learning. arXiv preprint arXiv: 1610.02707, 2016
[55] Joshi C K, Laurent T, Bresson X. On Learning paradigms for the travelling salesman problem. In: Proceedings of the 33rd Conference on Neural Information Processing Systems. Vancouver, Canada: Curran Associates, Inc., 2019.
[56] Joshi C K, Cappart Q, Rousseau L M, Laurent T, Bresson X. Learning TSP requires rethinking generalization. arXiv preprint arXiv: 2006.07054, 2020
[57] Barrett T, Clements W, Foerster J, Lvovsky A. Exploratory combinatorial optimization with reinforcement learning. Proceedings of the AAAI Conference on Artificial Intelligence, 2020, 34(4): 3243-3250 doi: 10.1609/aaai.v34i04.5723
[58] Beloborodov D, Ulanov A E, Foerster J N, Whiteson S, Lvovsky A I. Reinforcement learning enhanced quantum-inspired algorithm for combinatorial optimization. arXiv preprint arXiv: 2002.04676, 2020
[59] Cappart Q, Goutierre E, Bergman D, Rousseau L M. Improving optimization bounds using machine learning: Decision diagrams meet deep reinforcement learning. Proceedings of the AAAI Conference on Artificial Intelligence, 2019, 331(1): 1443-1451
[60] Abe K, Xu Z J, Sato I, Sugiyama M. Solving NP-hard problems on graphs by reinforcement learning without domain knowledge. arXiv preprint arXiv: 1905.11623, 2020
[61] Hu H Y, Zhang X D, Yan X W, Wang L F, Xu Y H. Solving a new 3D bin packing problem with deep reinforcement learning method. arXiv preprint arXiv: 1708.05930, 2017
[62] Laterre A, Fu Y G, Jabri M K, Cohen A S, Kas D, Hajjar K, et al. Ranked reward: Enabling self-play reinforcement learning for combinatorial optimization. arXiv preprint arXiv: 1807.01672, 2018
[63] Huang J Y, Patwary M, Diamos G. Coloring big graphs with alphagozero. arXiv preprint arXiv: 1902.10162, 2019
[64] Li J L, Shi W S, Zhang N, Shen X M. Delay-aware VNF scheduling: A reinforcement learning approach with variable action set. IEEE Transactions on Cognitive Communications and Networking, 2021, 7(1): 304-318 doi: 10.1109/TCCN.2020.2988908
[65] Mijumbi R, Hasija S, Davy S, Davy A, Jennings B, Boutaba R. Topology-aware prediction of virtual network function resource requirements. IEEE Transactions on Network and Service Management, 2017, 14(1): 106-120 doi: 10.1109/TNSM.2017.2666781
[66] Mijumbi R, Hasija S, Davy S, Davy A, Jennings B, Boutaba R. A connectionist approach to dynamic resource management for virtualised network functions. In: Proceedings of the 12th Conference on Network and Service Management (CNSM). Montreal, Quebec, Canada: IEEE, 2016. 1?9
[67] Quang P T A, Hadjadj-Aoul Y, Outtagarts A. A deep reinforcement learning approach for VNF forwarding graph embedding. IEEE Transactions on Network and Service Management, 2019, 16(4): 1318-1331 doi: 10.1109/TNSM.2019.2947905
[68] Solozabal R, Ceberio J, Sanchoyerto A, Zabala L, Blanco B, Liberal F. Virtual network function placement optimization with deep reinforcement learning. IEEE Journal on Selected Areas in Communications, 2020, 38(2): 292-303 doi: 10.1109/JSAC.2019.2959183
[69] Liu Q, Han T, Moges E. EdgeSlice: Slicing wireless edge computing network with decentralized deep reinforcement learning. arXiv preprint arXiv: 2003.12911, 2020
[70] Van Huynh N, Hoang D T, Nguyen D N, Dutkiewicz E. Optimal and fast real-time resource slicing with deep dueling neural networks. IEEE Journal on Selected Areas in Communications, 2019, 37(6): 1455-1470 doi: 10.1109/JSAC.2019.2904371
[71] Mseddi A, Jaafar W, Elbiaze H, Ajib W. Intelligent resource allocation in dynamic fog computing environments. In: Proceedings of the 8th International Conference on Cloud Networking (CloudNet). Coimbra, Portugal: IEEE, 2019. 1?7
[72] Almasan P, Suarez-Varela J, Badia-Sampera A, Rusek K, Barlet-Ros P, Cabellos-Aparicio A. Deep reinforcement learning meets graph neural networks: Exploring a routing optimization use case. arXiv preprint arXiv: 1910.07421, 2020
[73] Meng X Y, Inaltekin H, Krongold B. Deep reinforcement learning-based topology optimization for self-organized wireless sensor networks. In: Proceedings of the 2019 IEEE Global Communications Conference (GLOBECOM). Waikoloa, HI, USA: IEEE, 2019. 1?6
[74] Lu J Y, Feng L Y, Yang J, Hassan M M, Alelaiwi A, Humar I. Artificial agent: The fusion of artificial intelligence and a mobile agent for energy-efficient traffic control in wireless sensor networks. Future Generation Computer Systems, 2019, 95: 45-51 doi: 10.1016/j.future.2018.12.024
[75] Zhang S, Shen W L, Zhang M, Cao X H, Cheng Y. Experience-driven wireless D2D network link scheduling: A deep learning approach. In: Proceedings of the 2019 IEEE International Conference on Communications (ICC). Shanghai, China: IEEE, 2019. 1?6
[76] Huang L, Bi S Z, Zhang Y J A. Deep reinforcement learning for online computation offloading in wireless powered mobile-edge computing networks. IEEE Transactions on Mobile Computing, 2020, 19(11): 2581-2593 doi: 10.1109/TMC.2019.2928811
[77] Wang J, Hu J, Min G Y, Zhan W H, Ni Q, Georgalas N. Computation offloading in multi-access edge computing using a deep sequential model based on reinforcement learning. IEEE Communications Magazine, 2019, 57(5): 64-69 doi: 10.1109/MCOM.2019.1800971
[78] Jiang Q M, Zhang Y, Yan J Y. Neural combinatorial optimization for energy-efficient offloading in mobile edge computing. IEEE Access, 2020, 8: 35077-35089 doi: 10.1109/ACCESS.2020.2974484
[79] Yu J J Q, Yu W, Gu J T. Online vehicle routing with neural combinatorial optimization and deep reinforcement learning. IEEE Transactions on Intelligent Transportation Systems, 2019, 20(10): 3806-3817 doi: 10.1109/TITS.2019.2909109
[80] Holler J, Vuorio R, Qin Z W, Tang X C, Jiao Y, Jin T C, et al. Deep reinforcement learning for multi-driver vehicle dispatching and repositioning problem. In: Proceedings of the 2019 IEEE International Conference on Data Mining (ICDM). Beijing, China: IEEE, 2019. 1090?1095
[81] Liang X Y, Du X S, Wang G L, Han Z. A deep reinforcement learning network for traffic light cycle control. IEEE Transactions on Vehicular Technology, 2019, 68(2): 1243-1253 doi: 10.1109/TVT.2018.2890726
[82] Chen X Y, Tian Y D. Learning to progressively plan. arXiv preprint arXiv: 1810.00337, 2018
[83] Zheng P, Zuo L L, Wang J L, Zhang J. Pointer networks for solving the permutation flow shop scheduling problem. In: Proceedings of the 48th International Conference on Computers & Industrial Engineering (CIE48). Auckland, New Zealand, 2018. 2?5
[84] Pan R Y, Dong X Y, Han S. Solving permutation flowshop problem with deep reinforcement learning. In: Proceedings of the 2020 Prognostics and Health Management Conference (PHM-Besancon). Besancon, France: IEEE, 2020. 349?353
[85] Mirhoseini A, Pham H, Le Q V, Steiner B, Larsen R, Zhou Y F, et al. Device placement optimization with reinforcement learning. arXiv preprint arXiv: 1706.04972, 2017
[86] Mirhoseini A, Goldie A, Pham H, Steiner B, Le Q V, Dean J. A hierarchical model for device placement. In: Proceedings of the 6th International Conference on Learning Representations. Vancouver, BC, Canada, 2018.
[87] Francois-Lavet V, Taralla D, Ernst D, Fonteneau R. Deep reinforcement learning solutions for energy microgrids management. In: Proceedings of the 13th European Workshop on Reinforcement Learning (EWRL 2016). Barcelona, Spain, 2016.
[88] 张自东, 邱才明, 张东霞, 徐舒玮, 贺兴. 基于深度强化学习的微电网复合储能协调控制方法. 电网技术, 2019, 43(6): 1914-1921

Zhang Zi-Dong, Qiu Cai-Ming, Zhang Dong-Xia, Xu Shu-Wei, He Xing. A coordinated control method for hybrid energy storage system in microgrid based on deep reinforcement learning. Power System Technology, 2019, 43(6): 1914-1921
[89] Valladares W, Galindo M, Gutierrez J, Wu W C, Liao K K, Liao J C, et al. Energy optimization associated with thermal comfort and indoor air control via a deep reinforcement learning algorithm. Building and Environment, 2019, 155: 105-117 doi: 10.1016/j.buildenv.2019.03.038
[90] Mocanu E, Mocanu D C, Nguyen P H, Liotta A, Webber M E, Gibescu M, et al. On-line building energy optimization using deep reinforcement learning. IEEE Transactions on Smart Grid, 2019, 10(4): 3698-3708 doi: 10.1109/TSG.2018.2834219

平台注册入口