Smooth quantile regression and distributed inference for non-randomly stored big data

  • Kangning Wang
  • Jiaojiao Jia
  • Kemal Polat
  • Xiaofei Sun
  • Adi Alhudhaif
  • Fayadh Alenezi

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

In recent years, many distributed algorithms for big data quantile regression have been proposed. However, they all rely on the assumption that the data are stored in a random manner. This assumption seldom holds in practice, and its violation can seriously degrade their performance. Moreover, the non-smooth quantile loss is inconvenient in both computation and theory. To address these issues, we first propose a convex and smooth quantile loss that converges uniformly to the quantile loss. We then construct a novel pilot-sample surrogate smooth quantile loss, which enables communication-efficient distributed quantile regression and overcomes the non-randomly distributed nature of big data. In theory, the estimation consistency and asymptotic normality of the resulting distributed estimator are established. The theoretical results guarantee that the new method adapts to data stored in any arbitrary way and performs as well as if all the data were pooled on a single machine. Numerical experiments on both synthetic and real data verify the good performance of the new method.
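
The abstract does not give the form of the proposed smooth loss, so the following is only an illustrative sketch of the general idea of a convex, smooth surrogate that converges uniformly to the quantile loss. It assumes a logistic (softplus) smoothing with bandwidth h, which may differ from the authors' construction; the function names `check_loss`, `smooth_check_loss`, and `max_gap` are hypothetical. For this particular smoothing the gap to the quantile loss is largest at zero, where it equals h·log 2, so it shrinks uniformly as h goes to 0.

```python
import numpy as np


def check_loss(u, tau):
    """Standard non-smooth quantile (check) loss: u * (tau - 1{u < 0})."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0))


def smooth_check_loss(u, tau, h):
    """Logistic-smoothed quantile loss: tau*u + h*log(1 + exp(-u/h)).

    ASSUMPTION: this smoothing is chosen for illustration only and is not
    taken from the paper. It is convex and infinitely differentiable in u.
    """
    u = np.asarray(u, dtype=float)
    # np.logaddexp(0, x) evaluates log(1 + exp(x)) without overflow.
    return tau * u + h * np.logaddexp(0.0, -u / h)


def max_gap(tau, h, lo=-5.0, hi=5.0, n=1001):
    """Largest absolute gap between the two losses on a grid over [lo, hi]."""
    u = np.linspace(lo, hi, n)
    return float(np.max(np.abs(smooth_check_loss(u, tau, h) - check_loss(u, tau))))


if __name__ == "__main__":
    # The gap is bounded by h*log(2) for this smoothing, so it shrinks with h.
    for h in (1.0, 0.1, 0.01):
        print(f"h={h}: max gap={max_gap(0.5, h):.6f}, bound={h * np.log(2):.6f}")
```

Because the smoothed loss is differentiable everywhere, gradient-based and Newton-type solvers can be applied to it directly, which is the kind of computational convenience the abstract attributes to smoothing.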

Original language: English
Article number: 119418
Journal: Expert Systems with Applications
Volume: 215
State: Published - 1 Apr 2023

Keywords

  • Big data
  • Communication efficiency
  • Distributed algorithm
  • Quantile regression
