Abstract: Multi-task learning trains a machine learning model on data from multiple tasks simultaneously, using shared representations to learn the feature relationships common to a set of related tasks. These shared representations improve data efficiency and can yield faster learning on related or downstream tasks, helping to lower the barriers that deep learning faces in terms of large-scale data and compute requirements. Multi-task learning also reflects the human learning process more faithfully than single-task learning, because integrating knowledge across domains is a natural way for humans to learn. However, current multi-task learning methods are built on deep neural networks and therefore suffer from high training cost: a large amount of data is needed to reach satisfactory performance, yet for many problems enough data simply cannot be collected, so such problems are handled poorly and these methods cannot be applied to small-scale data tasks.
Deep forest is a recent deep learning framework built on ensembles of tree models. Whereas deep neural networks demand considerable effort to tune their hyperparameters, deep forest is much easier to configure. An efficient training procedure and an extensible, scalable design are further characteristics of deep forest. In addition, deep forest can be trained even when the training data is small, while still delivering highly competitive predictive performance, and as a tree-based algorithm it offers better interpretability than deep neural networks. However, the current deep forest design targets a single task and cannot be applied in a multi-task setting.
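To make the cascade structure behind deep forest concrete, the following is a minimal sketch in Python with scikit-learn; it is not the thesis implementation nor the official gcForest/deep-forest API. Each layer trains a pair of forests, appends their out-of-fold class-probability vectors to the original features, and growth stops once accuracy no longer improves. The function name fit_cascade, the choice of two forests per layer, and the stopping rule are illustrative assumptions.

import numpy as np
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_predict

def fit_cascade(X, y, max_layers=5, cv=3, random_state=0):
    # Greedy, gcForest-style cascade sketch.  y is assumed to hold integer
    # class labels 0..n_classes-1 so probability columns align with labels.
    layers, augmented, best_acc = [], X, 0.0
    for _ in range(max_layers):
        forests = [
            RandomForestClassifier(n_estimators=100, random_state=random_state),
            ExtraTreesClassifier(n_estimators=100, random_state=random_state),
        ]
        # Out-of-fold probabilities avoid leaking training labels into the next layer.
        probas = [cross_val_predict(f, augmented, y, cv=cv, method="predict_proba")
                  for f in forests]
        acc = np.mean(np.argmax(np.mean(probas, axis=0), axis=1) == y)
        if acc <= best_acc:      # stop growing once accuracy plateaus
            break
        best_acc = acc
        for f in forests:        # refit on all data for use at prediction time
            f.fit(augmented, y)
        layers.append(forests)
        # The next layer sees the original features plus this layer's outputs.
        augmented = np.hstack([X] + probas)
    return layers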
We therefore develop multi-task deep forest. At every layer, multi-task deep forest jointly embeds the features produced by each task after that layer's analysis, in order to discover the representational relationships among the tasks. Taking the deep forest algorithm as a foundation, we extend it with a multi-task architecture to address problems with small amounts of data, and, inheriting the deep forest design, the multi-task deep forest algorithm can likewise control model complexity to reduce overfitting. Experiments show that the proposed method not only outperforms deep forest on the benchmark evaluation metrics, but also, in the multi-task setting, resolves the similarity of feature relationships between tasks well and can rank the influence of all tasks. Through the multi-task deep forest algorithm we can ultimately identify the best combination of tasks for building a predictive model, and the model produced from this multi-task combination yields better predictions.
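The abstract does not spell out the joint embedding mechanism, so the sketch below is only one plausible reading, under the assumptions that the tasks share aligned samples and that "jointly embedding" means concatenating every task's per-layer class-probability output into every task's input for the next layer. The name fit_multitask_layer and its parameters are hypothetical rather than the thesis code, and a full implementation would use out-of-fold probabilities as in the single-task sketch above.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

def fit_multitask_layer(task_data, n_estimators=100, random_state=0):
    # One multi-task cascade layer.  task_data maps task name -> (X, y),
    # with rows assumed aligned across tasks (same samples, different labels).
    forests, probas = {}, {}
    for name, (X, y) in task_data.items():
        forest = RandomForestClassifier(n_estimators=n_estimators,
                                        random_state=random_state)
        forest.fit(X, y)
        forests[name] = forest
        probas[name] = forest.predict_proba(X)   # this task's layer output

    # Jointly embed: every task's next-layer input carries all tasks' outputs.
    shared = np.hstack(list(probas.values()))
    next_inputs = {name: (np.hstack([X, shared]), y)
                   for name, (X, y) in task_data.items()}
    return forests, next_inputs

Stacking several such layers and stopping when each task's validation accuracy plateaus would mirror the single-task cascade, and measuring how much each task's embedded output helps the others is one way to rank task influence and choose the best task combination.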