- Acknowledgement: We expect you to make an honest effort to solve the problems individually. Because we sometimes reuse problem set questions from previous years, and solutions may be available in papers and on web pages, we expect you NOT to copy, refer to, or look at such solutions when preparing your answers (consulting unauthorized material is considered a violation of the honor principle). Similarly, we expect you not to search online directly for answers (though you are free to search for background knowledge about the topic). If you do happen to use other material, it must be acknowledged in your submission.
- Required homework submission format: Write your code and answers directly in the attached Jupyter notebook files. Pay attention to the comments and instructions to see exactly which parts need to be changed. The teaching assistant will grade your assignment mainly on the correctness of your implementation and the soundness of your analytical answers.
- Collaborators: If you collaborated with others on any questions, list those questions and the names of your collaborators. Even if you acknowledge your collaborators, your solution must be written entirely in your own words.
- In PA4, you need to build a deep Q-network (DQN) to play the CartPole game. You are given one Jupyter notebook file named `DQN.ipynb`.
- Write your code and answers by editing the file directly. If you have trouble working with Jupyter notebooks, ask the TA for help. You are expected to use only NumPy to implement the algorithms.
- After finishing your assignment, pack all related files into one zip file. Note that `DQN.ipynb` must be included. You are also encouraged to include your training log and score curve. Then submit the zip file to the THU web learning page.
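
The packing step can be done by hand or with a short script. The following is only a minimal sketch using Python's standard `zipfile` module; the function name `pack_submission`, the default archive name `PA4_submission.zip`, and the extra file names mentioned afterwards are illustrative assumptions, not names required by the assignment.

```python
import zipfile
from pathlib import Path


def pack_submission(files, archive_name="PA4_submission.zip"):
    """Pack the given files into one zip archive and return the stored names.

    The archive name is a hypothetical default; use whatever name the
    course asks for. DQN.ipynb must be among the files.
    """
    paths = [Path(f) for f in files]

    # The notebook is the one file the instructions explicitly require.
    if not any(p.name == "DQN.ipynb" for p in paths):
        raise ValueError("DQN.ipynb must be included in the submission")

    # Fail early if any listed file does not exist.
    missing = [str(p) for p in paths if not p.is_file()]
    if missing:
        raise FileNotFoundError(f"missing files: {missing}")

    with zipfile.ZipFile(archive_name, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for p in paths:
            # Store each file under its base name, without directories.
            zf.write(p, arcname=p.name)
        return zf.namelist()
```

For example, assuming your log and plot are saved as `training_log.txt` and `score_curve.png` (hypothetical names), calling `pack_submission(["DQN.ipynb", "training_log.txt", "score_curve.png"])` would write all three files into a single archive ready for upload.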