Qiang Han

2 Papers

CL | Aug 27, 2020
Improvement of a dedicated model for open domain persona-aware dialogue generation

Qiang Han

This paper analyzes recent methods for improving the speed and performance of the Transformer architecture, focusing on their application to training a dedicated model. The dedicated model studied here is an open-domain persona-aware dialogue generation model, and the dataset consists of multi-turn short dialogues; the total length of a single input sequence is no more than 105 tokens. Therefore, the many improvements to the Transformer architecture and its attention mechanism that target long-sequence processing are not discussed in this paper. The source code of the experiments is open source: https://github.com/ghosthamlet/persona

5.4 | NA | Mar 16
A deep backward regression-based scheme for high-dimensional nonlinear partial differential equations

Qiang Han, Shaolin Ji, Yunzhang Li

A deep backward regression-based (DBR) scheme for solving high-dimensional nonlinear parabolic partial differential equations is proposed. Building upon the established DBDP method (Huré et al., 2020), our algorithm introduces a reformulation of the local loss functions that are sequentially optimized via backward induction at each time step. The core of this approach involves reformulating simulated backward stochastic difference equations into their conditional expectation representations, thereby recasting a projection-based stochastic optimization problem as a deterministic function-approximation task. By explicitly incorporating conditional expectations, the DBR scheme facilitates a denoising mechanism prior to loss evaluation. This architecture substantially mitigates numerical variance, resulting in enhanced training stability and superior generalization performance. Numerical results demonstrate that the DBR scheme consistently outperforms the DBDP1 method, maintaining accuracy up to $d=200$ for bounded solutions (see Table 1). Notably, for complex unbounded PDEs where the DBDP1 method fails beyond $d=10$, the DBR scheme remains robust up to $d=20$ with relative errors under 9.7% (see Table 6). Theoretically, we derive rigorous upper error bounds and establish half-order convergence for the proposed scheme. Extensions to variational inequalities are also provided.
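
As a rough illustration of the reformulation described in this abstract, the relations below contrast the pathwise DBDP1 local loss of Huré et al. (2020) with the standard conditional-expectation form of the discretized backward equation. All notation is assumed here for illustration and does not come from the abstract: a time grid with step $\Delta t_i$, Brownian increment $\Delta W_i$, simulated state $X_i$, driver $f$, networks $U_i$ and $Z_i$ with parameters $\theta$, and the previously trained network $\widehat{U}_{i+1}$. The abstract does not state the exact DBR loss, so this is a sketch of the underlying idea rather than the paper's formulas.

```latex
% Notation assumed for illustration; not taken from the abstract.
% DBDP1 local loss at step i (Hur\'e et al., 2020): a pathwise residual
% that still contains the noise term Z_i \cdot \Delta W_i.
\[
L_i^{\mathrm{DBDP1}}(\theta)
= \mathbb{E}\Big|\,\widehat{U}_{i+1}(X_{i+1}) - U_i(X_i;\theta)
+ f\big(t_i, X_i, U_i(X_i;\theta), Z_i(X_i;\theta)\big)\,\Delta t_i
- Z_i(X_i;\theta)\cdot\Delta W_i\,\Big|^2 .
\]
% Backward stochastic difference equation behind that residual:
%   Y_{i+1} = Y_i - f(t_i, X_i, Y_i, Z_i)\,\Delta t_i + Z_i \cdot \Delta W_i .
% Taking conditional expectations given X_i (and, for Z_i, first
% multiplying by \Delta W_i) removes the Brownian noise:
\[
Y_i = \mathbb{E}\big[\,Y_{i+1} \mid X_i\,\big]
      + f(t_i, X_i, Y_i, Z_i)\,\Delta t_i ,
\qquad
Z_i = \frac{1}{\Delta t_i}\,
      \mathbb{E}\big[\,Y_{i+1}\,\Delta W_i \mid X_i\,\big] .
\]
```

In the second form the right-hand sides are deterministic functions of $X_i$, so fitting $U_i$ and $Z_i$ to them is a function-approximation problem. This appears to be the sense in which the abstract speaks of denoising before the loss is evaluated.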