Xiangyu shared an instance normalization method for time-series forecasting under distribution shift, published in ICLR 2022.
With this method, the MSE of the MLP on Gaode's dataset (350 days of training data, 50 days of test data) improved from 4.65 to 3.26 (a 30%+ reduction).
Next, I will implement this method in our meta-learning framework to measure the improvement and compare the effectiveness of our method against the normalization. [Presentation slides: https://cloud.tsinghua.edu.cn/f/75320708a08f404e8a0b/]
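As a reference point before integrating it into the framework, here is a minimal sketch of the idea, assuming the ICLR 2022 method (like RevIN) normalizes each input window by its own statistics and de-normalizes the model's output; the function names and shapes are illustrative, not the paper's API:

```python
import numpy as np

def instance_normalize(x, eps=1e-5):
    # Normalize each window (instance) by its own mean and std,
    # so the model sees inputs free of per-window level shifts.
    mu = x.mean(axis=-1, keepdims=True)
    sigma = x.std(axis=-1, keepdims=True)
    return (x - mu) / (sigma + eps), (mu, sigma)

def instance_denormalize(y, stats, eps=1e-5):
    # Restore the saved statistics on the model output, so the
    # prediction lives back in the data's original scale.
    mu, sigma = stats
    return y * (sigma + eps) + mu
```

The key point is that the statistics are computed per instance at inference time, so a level shift between training and test windows is absorbed before the model sees the data.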
* Zhiyuan
Summary:
This week, I conducted a series of experiments comparing the performance of our Soft-restricted MF multi-task learning model against models trained with a single loss.
The experiments reveal that the multi-task loss contributes only a limited improvement on both tasks.
Moreover, the LSTM-based backbone tends to produce smoother predictions than the real data, which fluctuates more.
An important observation: the MSE mainly comes from a few regions with high volatility.
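That observation can be checked directly by attributing the total error to each region; a small sketch (the array shapes are an assumption about how the predictions are stored):

```python
import numpy as np

def mse_by_region(y_true, y_pred):
    # y_true, y_pred: (n_regions, n_timesteps).
    # Returns each region's MSE and its share of the total, to
    # confirm which high-volatility regions dominate the error.
    region_mse = ((y_true - y_pred) ** 2).mean(axis=1)
    return region_mse, region_mse / region_mse.sum()
```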
Future Plan:
Next week we may try multi-source input data or time-series analysis methods to address this.
2022/6/30
Attendee: Zhiyuan, Xiangyu, Yang
Meeting Summary:
Transcribe the matrix formulation into its Lagrangian form
Use gradient descent and integer projection to iteratively solve the optimization problem
Interpret and analyze the loss results of the proposed method
Regard the greedy algorithm as the supremum and the dynamic programming / real-relaxed optimization as the infimum, and compute their approximation ratio against the optimal result
Round the real-domain result to integers using a hyperparameter threshold
Confirm that the Pathlet application uses a top-k dictionary to recompose new trajectories, and find how much information can be kept
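The iterate-then-round loop discussed above can be sketched as follows; the quadratic objective, step size, and threshold `tau` below are illustrative assumptions, not the actual Pathlet objective:

```python
import numpy as np

def relax_round(grad_f, x0, lr=0.01, tau=0.5, steps=200):
    # Gradient descent on the real relaxation x in [0, 1]^n,
    # projecting back into the box after every step, then rounding
    # the final iterate to {0, 1} with a hyperparameter threshold tau.
    x = np.clip(x0.astype(float), 0.0, 1.0)
    for _ in range(steps):
        x = np.clip(x - lr * grad_f(x), 0.0, 1.0)
    return (x >= tau).astype(int)
```

Sweeping `tau` over a validation set is one way to pick the rounding threshold mentioned in the notes.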
TO-DO:
Find the optimal threshold for integer-rounding the real-domain solution, i.e., use the matrix formulation to update the result while still guaranteeing the integer constraints
Find applications for the Pathlet in real-world projects, such as a new encoding for spatial data
2022/06/08
Attendee: Zhiyuan, Yuanbo, Yang
Meeting Summary:
Zhiyuan Peng:
Briefly introduce the general implementation of the dynamic programming and its data structures.
Show one trajectory's results under different λ.
Yuanbo Tang
Introduce the proposed algorithm in state-machine terms and draw an analogy to the knapsack problem with a greedy approach (it still needs to be connected to a specific mathematical problem to establish correctness)
Analyze and compare the results of the proposed method with DP (including dictionary size and representation cost under different λ)
Use a toy model to illustrate the logical flaw and the failure to reach the optimal result
Based on a finite map and sufficient trajectories, transcribe the problem into a matrix form
TO-DO:
Survey linearly constrained minimax convex optimization; can the max operator be replaced?
Consider the optimization in its matrix form and in the proposed algorithm's form. Could it be connected to a Markov decision process?
If our model brings in more features, we can use a neural network.
Connect the proposed algorithm with a mathematical problem such as the multi-armed bandit
2022/05/11
Attendee: Zhiyuan, Xiangyu, Yang
Meeting Summary:
Use the Python package ‘pyecharts’ to visualize the cumulative record counts in different H3 hexagons with the Xi'an data
Implement an LSTM on the processed data to predict the record counts for the next 5-minute time slices, up to 5 steps ahead
Prepare the slides for the opening report
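A minimal sketch of the forecasting setup described above, assuming a PyTorch LSTM rolled out autoregressively for 5 steps (one 5-minute slice per step); the architecture and sizes are placeholders, not the model actually trained:

```python
import torch
import torch.nn as nn

class RecordLSTM(nn.Module):
    # Hypothetical sketch: encode the history of record counts,
    # then roll the LSTM forward `horizon` steps autoregressively.
    def __init__(self, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x, horizon=5):
        # x: (batch, seq_len, 1) record counts per 5-minute slice
        out, state = self.lstm(x)
        preds = [self.head(out[:, -1:, :])]       # 1 step ahead
        for _ in range(horizon - 1):              # feed predictions back in
            out, state = self.lstm(preds[-1], state)
            preds.append(self.head(out))
        return torch.cat(preds, dim=1)            # (batch, horizon, 1)
```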
To-Do
Review how historical data are constructed in the literature and survey how to jointly use long- and short-term data; fusion or residual methods are candidates
Find metrics other than MSE to measure the predictions, such as MAPE
To handle data sparsity and erratic behavior, we can build a hybrid model:
First, set a threshold, filter out the districts whose historical average falls below it, and apply a traditional statistical model to them
Second, use our multi-task DNN to predict the remaining districts
Finally, combine the two cases into one hybrid prediction
Identify the challenges and remedial algorithms
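The three-step hybrid plan above can be sketched like this; the routing threshold and the two models are illustrative stand-ins, not our actual statistical model or multi-task DNN:

```python
import numpy as np

def hybrid_predict(hist, threshold, stat_model, dnn_model):
    # hist: (n_districts, n_days) historical counts.
    # Sparse districts (historical average below the threshold) are
    # routed to a simple statistical model; all remaining districts
    # are routed to the multi-task DNN, and the two are merged.
    sparse = hist.mean(axis=1) < threshold
    preds = np.empty(len(hist))
    preds[sparse] = stat_model(hist[sparse])
    preds[~sparse] = dnn_model(hist[~sparse])
    return preds
```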
2022/03/09
Attendee: Zhiyuan, Xiangyu, Yang
Meeting Summary:
Zhiyuan: Share the LSTM+GCN overfitting problems; Yang gives some advice:
1. Do more data mining on the dataset, e.g., visualize the prediction results and examine the correlation between close points in the graph.
Xiangyu: Show the results of META-MLP and MLP on UberNYC. Yang gives advice:
1. Dividing the task into weekdays and weekends looks like a cross-domain problem. In cross-domain problems, pre-training is often better than meta-learning; Yang shares an article on cross-domain learning.
2. Change the backbone network (e.g., to a GCN) to see the results.
Yang: Communicate with the Gaode group once every two weeks.
To Do:
Zhiyuan and Yuanbo:
1. Visualize the graph model's predictions and compare them with the ground truth using a heat map.
2. Pre-train the LSTM, then feed the results into the GCN.
Xiangyu:
1. Construct few-shot tasks: remove some weekend data to test Meta-MLP and pre-training.
2. Add a baseline that uses a traditional time-series prediction method.
Share related multi-task ride-hailing prediction papers.
Pre-process the dataset and trim it into 8-day sliding windows for region-based outflow prediction
Construct a graph based on the closeness relationship among regions
Construct a prediction network based on an LSTM followed by a GCN; however, the loss does not converge
Discuss the training of the base model: how to construct the GCN's graph and how to select the centroids (both are given by the dataset); the GCN training does not converge
Discuss MLP+META compared with the pre-training method; try more metrics (MAE, etc.) and the task-division method
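The closeness-based graph mentioned in these notes could be built from region centroids like this; the distance threshold `radius` and the centroid input are assumptions about how closeness is defined:

```python
import numpy as np

def closeness_adjacency(coords, radius):
    # coords: (n_regions, 2) region centroids (e.g., hexagon centers).
    # Connect two regions when their centroid distance is within
    # `radius`; the result is a symmetric 0/1 adjacency matrix
    # without self-loops, usable as the GCN's graph.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    adj = (d <= radius).astype(float)
    np.fill_diagonal(adj, 0.0)  # no self-loops
    return adj
```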