Hiroshi Abe

LG
5 papers
93 citations
Novelty: 45%
AI Score: 26

5 Papers

NI | Nov 11, 2021 | Code
Classification of URL bitstreams using Bag of Bytes

Keiichi Shima, Daisuke Miyamoto, Hiroshi Abe et al.

Protecting users from accessing malicious web sites is one of the important management tasks for network operators. There are many open-source and commercial products that control which web sites users can access. The most traditional approach is blacklist-based filtering. This mechanism is simple but not scalable, though there are some enhanced approaches utilizing fuzzy matching technologies. Other approaches try to use machine learning (ML) techniques by extracting features from URL strings. This approach can cover a wider area of Internet web sites, but finding good features requires deep knowledge of trends in web site design. Recently, another approach using deep learning (DL) has appeared. The DL approach helps to extract features automatically by investigating a large amount of existing sample data. Using this technique, we can build a flexible filtering decision module by continually teaching the neural network module about recent trends, without any specific expert knowledge of the URL domain. In this paper, we apply a mechanical approach to generate feature vectors from URL strings. We implemented our approach and tested it with realistic URL access history data taken from a research organization and data from the well-known archive site of phishing site information, PhishTank.com. Our approach achieved 2-3% better accuracy compared to the existing DL-based approach.
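As a rough illustration of what a "mechanical" feature vector from a URL string could look like, the sketch below counts how often each byte value occurs in the URL. The function name `bag_of_bytes` and the plain 256-bin histogram are assumptions made for this example; the abstract does not specify the paper's actual encoding, which may differ.

```python
from collections import Counter


def bag_of_bytes(url: str) -> list:
    """Return a 256-dimensional count vector of the byte values in a URL.

    Hypothetical helper: a plain byte histogram, not necessarily the
    feature construction used in the paper.
    """
    counts = Counter(url.encode("utf-8"))  # iterating bytes yields ints 0..255
    return [counts.get(b, 0) for b in range(256)]


vec = bag_of_bytes("http://example.com/a?a=1")
print(len(vec), vec[ord("a")])
```

No hand-crafted URL knowledge goes into such a vector, which is the property the abstract emphasizes.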

NA | Nov 10, 2010
A Stable Explicit Scheme for Solving Inhomogeneous Constant Coefficients Differential Equation using Green's Function

Hiroshi Abe

An explicit numerical method for evaluating transient solutions of linear inhomogeneous partial differential equations with constant coefficients is proposed. A general form of the scheme for a specific linear inhomogeneous equation is shown. The method is applied to the wave equation and the diffusion equation and is investigated by simulating simple models. The numerical solutions of the proposed method show good agreement with the exact solutions. In comparison, the explicit FDM becomes unstable when the CFL condition is violated, whereas the proposed method is always stable regardless of the time step width.
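The instability of the explicit FDM mentioned above can be reproduced with a small experiment. The sketch below runs the standard forward-time, centered-space (FTCS) scheme for the 1D diffusion equation, which is stable only when r = alpha*dt/dx^2 <= 1/2. It shows only the FDM side of the comparison; it does not implement the Green's function scheme, and the grid size, step count, and the name `ftcs_max_amplitude` are illustrative assumptions.

```python
def ftcs_max_amplitude(r: float, n_points: int = 21, n_steps: int = 200) -> float:
    """Run explicit FTCS for u_t = alpha * u_xx and return max |u| at the end.

    r is the dimensionless step ratio alpha * dt / dx**2.
    """
    u = [0.0] * n_points
    u[n_points // 2] = 1.0  # initial spike; boundary values stay at zero
    for _ in range(n_steps):
        new = u[:]
        for i in range(1, n_points - 1):
            new[i] = u[i] + r * (u[i + 1] - 2.0 * u[i] + u[i - 1])
        u = new
    return max(abs(v) for v in u)


stable = ftcs_max_amplitude(0.4)    # r <= 1/2: the spike diffuses and decays
unstable = ftcs_max_amplitude(0.6)  # r > 1/2: high-frequency modes blow up
print(stable, unstable)
```

Doubling the time step at a fixed grid spacing doubles r, which is why the explicit FDM restricts the usable time step width.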

LG | Sep 25, 2019
Compression based bound for non-compressed network: unified generalization error analysis of large compressible deep neural network

Taiji Suzuki, Hiroshi Abe, Tomoaki Nishimura

One of the biggest issues in deep learning theory is the generalization ability of networks with huge model size. Classical learning theory suggests that overparameterized models cause overfitting. However, the large deep models used in practice avoid overfitting, which is not well explained by the classical approaches. Several attempts have been made to resolve this issue. Among them, the compression-based bound is one of the promising approaches. However, the compression-based bound can be applied only to a compressed network, and it is not applicable to the non-compressed original network. In this paper, we give a unified framework that can convert compression-based bounds to those for non-compressed original networks. The bound gives an even better rate than the one for the compressed network by improving the bias term. By establishing the unified framework, we can obtain a data-dependent generalization error bound which gives a tighter evaluation than the data-independent ones.

ML | Aug 26, 2018
Spectral Pruning: Compressing Deep Neural Networks via Spectral Analysis and its Generalization Error

Taiji Suzuki, Hiroshi Abe, Tomoya Murata et al.

Compression techniques for deep neural network models are becoming very important for the efficient execution of high-performance deep learning systems on edge-computing devices. The concept of model compression is also important for analyzing the generalization error of deep learning, known as the compression-based error bound. However, there is still a huge gap between a practically effective compression method and its rigorous background in statistical learning theory. To resolve this issue, we develop a new theoretical framework for model compression and propose a new pruning method called spectral pruning based on this framework. We define the "degrees of freedom" to quantify the intrinsic dimensionality of a model by using the eigenvalue distribution of the covariance matrix across the internal nodes and show that the compression ability is essentially controlled by this quantity. Moreover, we present a sharp generalization error bound of the compressed model and characterize the bias-variance tradeoff induced by the compression procedure. We apply our method to several datasets to justify our theoretical analyses and show the superiority of the proposed method.
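To give a feel for how an eigenvalue distribution can quantify intrinsic dimensionality, the sketch below computes a ridge-style degrees of freedom, the sum of mu_j / (mu_j + lambda) over covariance eigenvalues mu_j. This formula is an assumption for illustration, since the abstract does not state the definition; the function name and the toy spectra are likewise hypothetical.

```python
def degrees_of_freedom(eigenvalues, lam):
    """Sum of mu / (mu + lam) over covariance eigenvalues mu, for lam > 0.

    Assumed ridge-style definition; each eigenvalue contributes a value
    between 0 and 1 depending on how it compares with lam.
    """
    return sum(mu / (mu + lam) for mu in eigenvalues)


fast_decay = [2.0 ** -j for j in range(20)]  # a few dominant directions
flat = [1.0] * 20                            # all directions equally important

print(degrees_of_freedom(fast_decay, 0.01))
print(degrees_of_freedom(flat, 0.01))
```

Under this assumed definition, a layer whose covariance spectrum decays quickly has far fewer effective dimensions than its node count, which suggests it can be pruned more aggressively.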