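Development tool:
File size: 319 KB
Downloads: 0
Upload date: 2019-07-16
Description: A survey of machine learning algorithms for big data, introducing the algorithms commonly used for machine learning on big data.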
Little Bootstraps
Bootstrap
Jordan
4
4.1
4.2
Kolda
Tucker
Memory-Efficient Tucker Decomposition (MET)
MET
Condensed Nearest Neighbor (CNN)
Reduced Nearest Neighbor (RNN)
MET
Edited Nearest Neighbor (ENN)
Wahba
10
Regularized Kernel Estimation (RKE)
CNN
Robust Manifold Unfolding (RMU)
Fast CNN (FCNN)
Jordan
Bootstrap
© 1994-2016 China Academic Journal Electronic Publishing House. All rights reserved. http://www.cnki.net
Self-Organizing Map (SOM)
SOM
16
Fast SOM (FSOM)
SVD, RP, PCA
Fuzzy Lower-Approximation-Based Fuzzy Rough Set Feature Selection with Threshold (T-FRFS)
QuickReduct
L-FRFS
Quevedo
SVM
Simulated Annealing and Genetic Algorithm (SAGA)
NP
Laskov
SVM
Pal
SVM
Minimax Probability Machine (MPM)
Incremental Kernel PCA
Sun
Least Squares SVM (LS-SVM)
Kim
Co-training Based on Random Forest (Co-Forest)
Decision tree
Franco-Arcega
Yang
Parallel Averaging Stochastic Gradient Descent (ASGD)
Incrementally Optimized Very Fast Decision Tree (iOVFDT)
120
1000
iOVFDT
Information bot
Ben-Haim
Extreme Learning Machine (ELM)
ELM
FCM
ELM
Havens
ELM
ELM
ELM
FCM
ELM
ELM
FCM, Random Sampling Plus Extension FCM
ELM
FCM, Bit-Reduced FCM
ELM
Approximate Kernel FCM
FCM
Havens
FCM
k
4
I/O
I/O
Hall
TB
MapReduce
4.5
Apriori
Zhao
MapReduce, Apriori
k-means
speedup, sizeup, scaleup
Papadimitriou
AprioriAll, AprioriSome, DynamicSome
MapReduce
Generalized Sequential Pattern (GSP)
clustering
Sequential Pattern Discovery Using Equivalence Classes (SPADE)
Distributed Co-clustering (DisCo)
Hadoop
DisCo
GB
Frequent Pattern-Projected Sequential Pattern Mining (FreeSpan) 50
MapReduce
KNN
Prefix-Projected Sequential Pattern Mining (PrefixSpan)
SPADE
Ferreira
Memory Indexing for Sequential Pattern Mining (MEMISP)
MapReduce
I/O
Sequential Pattern Mining with Regular Expression Constraints (SPIRIT) 59
Best of Both Worlds (BoW)
BoW
BoW
Havens
c-means
c-means
61
Sequential Pattern (GSP)
GSP
Mining Frequent Sequences (MFS)
GSP+MFS
SPADE
Incremental Sequence Mining (ISM)
ADABOOST.PL, LOGITBOOST.PL
ISM
63
Incremental Frequent Sequences Mining (ISE)
MapReduce
ISE
MapReduce
Incrementally Updating Sequences (IUS)
65
ISE
66
Latent Dirichlet Allocation (LDA)
Tradeoff between Performance and Difference (TPD)
TPD
Collapsed Gibbs Sampling (CGS)
Collapsed Variational Bayesian (CVB)
CPU
GPU
4.6
Graphic Processing Unit (GPU)
GPU
Luo
SVM
MapReduce
Compute Unified Device Architecture (CUDA)
60%
MapReduce
2008
Shim, MapReduce
PDMiner (Parallel Distributed Miner)
MapReduce
MapReduce
MapReduce
436076-8
Generalized Linear Aggregates Distributed Engine (GLADE)
Hefeeda
GLADE
User-Defined Aggregate (UDA)
GLADE
UDA
3 Hadoop, CLDA
[1] Labrinidis A, Jagadish H V. Challenges and Opportunities with Big Data. Proc of the VLDB Endowment, 2012, 5(12): 2032-2033
[2] Bizer C, Boncz P, Brodie M L, et al. The Meaningful Use of Big Data: Four Perspectives - Four Challenges. ACM SIGMOD Record, 2012, 40(4): 56
[3] Li G, Cheng X Q. Research Status and Scientific Thinking of Big Data. Bulletin of Chinese Academy of Sciences, 2012, 27(6): 647-657 (in Chinese)
[4] Wang F Y. A Big-Data Perspective on AI: Newton, Merton, and Analytics Intelligence. IEEE Intelligent Systems, 2012, 27(5): 2-4
[5] Simon H A. Why Should Machines Learn // Michalski R S, Carbonell J G, Mitchell T M, et al., eds. Machine Learning: An Artificial Intelligence Approach. Berlin, Germany: Springer, 1983: 25-37
[6] Hart P. The Condensed Nearest Neighbor Rule. IEEE Trans on Information Theory, 1968, 14(3): 515-516
[7] Gates G. The Reduced Nearest Neighbor Rule. IEEE Trans on Information Theory, 1972, 18(3): 431-433
[8] Brighton H, Mellish C. Advances in Instance Selection for Instance-Based Learning Algorithms. Data Mining and Knowledge Discovery …
[9] Li Y H, Maguire L. Selecting Critical Patterns Based on Local Geometrical and Statistical Information. IEEE Trans on Pattern Analysis and Machine Intelligence, 2011, 33(6) …
[10] Angiulli F. Fast Nearest Neighbor Condensation for Large Data Sets Classification. IEEE Trans on Knowledge and Data Engineering, 2007, 19(11): 1450-1464
[11] Angiulli F, Folino G. Distributed Nearest Neighbor-Based Condensation of Very Large Data Sets. IEEE Trans on Knowledge and Data Engineering, 2007, 19(12): 1593-1606
[12] Jordan M I. Divide-and-Conquer and Statistical Inference for Big Data // Proc of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Beijing, China, 2012. DOI: 10.1145/2339530.2339534
[13] Kolda T G, Sun J M. Scalable Tensor Decompositions for Multi-aspect Data Mining // Proc of the 8th IEEE International Conference on Data Mining. Pisa, Italy, 2008: 363-372
[14] Wahba G. Dissimilarity Data in Statistical Model Building and Machine Learning // Proc of the 5th International Congress of Chinese Mathematicians. Beijing, China, 2012: 785-809
[15] Hoi C H, Wang J L, Zhao P L, et al. Online Feature Selection for Mining Big Data // Proc of the 1st International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications. Beijing, China, 2012: 93-100
[16] Sagheer A, Tsuruta N, Taniguchi R I, et al. Fast Feature Extraction Approach for Multi-dimension Feature Space Problems // Proc of the 18th International Conference on Pattern Recognition. Hong Kong, China, 2006: 417-44…
[17] Anaraki J R, Eftekhari M. Improving Fuzzy-Rough Quick Reduct for Feature Selection // Proc of the 19th Iranian Conference on Electrical Engineering. Tehran, Iran, 2011: 1-6
[18] Quevedo J R, Bahamonde A, Luaces O. A Simple and Efficient Method for Variable Ranking according to Their Usefulness for … (1): 578-595
[19] Gheyas I A, Smith L S. Feature Subset Selection in Large Dimensionality Domains. Pattern Recognition, 2010, 43(1): 5-13
[20] Pal M, Foody G M. Feature Selection for Classification of Hyperspectral Data by SVM. IEEE Trans on Geoscience and Remote Sensing, 2010, 48(5): 2297-2307
[21] Sun Y, Todorovic S, Goodison S. Local-Learning-Based Feature Selection for High-Dimensional Data Analysis. IEEE Trans on Pattern Analysis and Machine Intelligence, 2010, 32(9): 1610-1626
[22] Hua J P, Tembe W D, Dougherty E R. Performance of Feature-Selection Methods in the Classification of High-Dimension Data. Pattern Recognition, 2009, 42 …
[23] Song M, Yang H, Siadat S H, et al. A Comparative Study of Dimensionality Reduction Techniques to Enhance Trace Clustering Performances. Expert Systems with Applications, 2013, 40(9): 3722-3737
[24] Lau K W, Wu Q H. Online Training of Support Vector Classifier. Pattern Recognition, 2003, 36(8): 1913-1920
[25] Laskov P, Gehl C, Kruger S, et al. Incremental Support Vector Learning: Analysis, Implementation and Applications. Journal of Machine Learning Research, 2006, 7: 1909-1936
[26] Huang K, Yang H, King I, et al. Maxi-Min Margin Machine: Learning Large Margin Classifiers Locally and Globally. IEEE Trans on Neural Networks, 2008, 19(2): 260-272
[27] Kim B J. A Classifier for Big Data // Proc of the 6th International Conference on Convergence and Hybrid Information Technology. Daejeon, Republic of Korea, 2012: 505-512
[28] Franco-Arcega A, Carrasco-Ochoa J A, Sanchez-Diaz … Building Fast Decision Trees from Large Training Sets. Intelligent Data Analysis, 2012, 16(4): 649-64…
[29] Yang H, Fong S. Incrementally Optimized Decision Tree for Noisy Big Data // Proc of the 1st International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications. Beijing, China, 20…
[30] Ben-Haim Y, Tom-Tov E. A Streaming Parallel Decision Tree Algorithm. Journal of Machine Learning Research, 2010, 11: 849-872
[31] Huang G B, Zhu Q Y, Siew C K. Extreme Learning Machine: Theory and Applications. Neurocomputing, 2006, 70(1/2/3): 489-50…
[32] … Ensemble Based Extreme Learning Machine. IEEE Signal Processing Letters, 2010, 17(8): 754-757
[33] He Q, Shang T F, Zhuang F Z, et al. Parallel Extreme Learning Machine for Regression Based on MapReduce. Neurocomputing, 2013, 102: 52-58
[34] Zhang R, Lan Y, Huang G B, et al. Universal Approximation of Extreme Learning Machine with Adaptive Growth of Hidden Nodes. IEEE Trans on Neural Networks and Learning Systems, 2012, 23(2): 365-371
[35] Rong H J, Huang G B, Sundararajan N, et al. Online Sequential Fuzzy Extreme Learning Machine for Function Approximation and Classification Problems. IEEE Trans on Systems, Man and Cybernetics, 2009, 39(4): 1067-1072
[36] Yang Y M, Wang X N, Yuan X F. Bidirectional Extreme Learning Machine for Regression Problem and Its Learning Effectiveness. IEEE Trans on Neural Networks and Learning Systems, 2012, 23 …
[37] Li M, Zhou Z H. Improve Computer-Aided Diagnosis with Machine Learning Techniques Using Undiagnosed Samples. IEEE Trans on Systems, Man and Cybernetics, 2007, 37(6): 1088-1098
[38] Lin Y Q, Li F J, Zhu S H, et al. Large-Scale Image Classification: Fast Feature Extraction and SVM Training // Proc of the IEEE Conference on Computer Vision and Pattern Recognition. Providence, USA, 2011: 1689-1696
[39] Ling X, Xue G R, Dai W Y, et al. Can Chinese Web Pages Be Classified with English Data Source // Proc of the 17th International Conference on World Wide Web. Beijing, China, 2008 …
[40] Havens T C, Bezdek J C, Leckie C, et al. Fuzzy c-means Algorithms for Very Large Data. IEEE Trans on Fuzzy Systems, 2012, 20(6): 1130-1146
[41] Xue Z H, Shen G, Li J H, et al. Compression-Aware I/O Performance Analysis for Big Data Clustering // Proc of the 1st International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications. Beijing, China, 2012: 45-52
[42] Hall L O. Exploring Big Data with Scalable Soft Clustering // Proc of the 6th International Conference on Soft Methods in Probability and Statistics. Konstanz, Germany, 2012: 11-15
[43] Zhao W Z, Ma H F, He Q. Parallel k-means Clustering Based on MapReduce // Proc of the 1st International Conference on Cloud Computing and Big Data. Beijing, China, 2009: 674-679
[44] Papadimitriou S, Sun J M. DisCo: Distributed Co-clustering with MapReduce: A Case Study towards Petabyte-Scale End-to-End Mining // Proc of the 8th IEEE International Conference on Data Mining. Pisa, Italy, 2008: 512-521
[45] Zhang C, Li F F, Jeffrey J. Efficient Parallel kNN Joins for Large Data in MapReduce // Proc of the 15th International Conference on Extending Database Technology. Berlin, Germany, 2012: 38-49
[46] Ferreira C R L, Junior T C, Traina A J M, et al. Clustering Very Large Multi-dimensional Datasets with MapReduce // Proc of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. San Diego, USA, 2011: 690-698
[47] Havens T C, Chitta R, Jain A K, et al. Speedup of Fuzzy and Possibilistic Kernel c-means for Large-Scale Clustering // Proc of the IEEE International Conference on Fuzzy Systems. Taipei, China, 2011: 463-470
[48] Niu D L, Dy J C, Jordan M I. Dimensionality Reduction for Spectral Clustering // Proc of the 14th International Conference on Artificial Intelligence and Statistics. Fort Lauderdale, USA, 2011: 552-560
[49] Kriegel H P, Kroger P, Zimek A. Clustering High-Dimensional Data: A Survey on Subspace Clustering, Pattern-Based Clustering, and Correlation Clustering. ACM Trans on Knowledge Discovery from Data, 2009, 3(1): 1-58
[50] Vidal R. Subspace Clustering. IEEE Trans on Signal Processing, 2011, 28(2): 52-68
[51] Zhou Y, Cheng H, Yu J X. Graph Clustering Based on Structural/Attribute Similarities. Proc of the VLDB Endowment, 2009, 2(1): 718-729
[52] Agrawal R, Srikant R. Fast Algorithms for Mining Association Rules in Large Databases // Proc of the 20th International Conference on Very Large Data Bases. Santiago de Chile, Chile, 1994: 487-499
[53] Agrawal R, Srikant R. Mining Sequential Patterns // Proc of the 11th International Conference on Data Engineering. Taipei, China, 1995: 3-14
[54] Srikant R, Agrawal R. Mining Sequential Patterns: Generalizations and Performance Improvements // Proc of the 5th International Conference on Extending Database Technology: Advances in Database Technology. Avignon, France, 1996: 3-17
[55] Zaki M J. SPADE: An Efficient Algorithm for Mining Frequent Sequences. Machine Learning, 2001, 42(1/2): 31-60
[56] Han J W, Kamber M, Pei J. Data Mining: Concepts and Techniques. 2nd Edition. New York, USA: Morgan Kaufmann, 2006
[57] Pei J, Han J W, Pinto H, et al. PrefixSpan: Mining Sequential Patterns Efficiently by Prefix-Projected Pattern Growth // Proc of the 17th International Conference on Data Engineering. Heidelberg, Germany, 2001: 215-224
[58] Lin M Y, Lee S Y. Fast Discovery of Sequential Patterns by Memory Indexing // Proc of the 4th International Conference on Data Warehousing and Knowledge Discovery. Aix-en-Provence, France, 2002: 150-160
[59] Garofalakis M N, Rastogi R, Shim K. SPIRIT: Sequential Pattern Mining with Regular Expression Constraints // Proc of the 25th International Conference on Very Large Data Bases. Edinburgh, Scotland, 1999: 223-234
[60] Li N, Zeng L, He Q, et al. Parallel Implementation of Apriori Algorithm Based on MapReduce // Proc of the 13th ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing. Kyoto, Japan, 2012: 191-200
[61] Zhang M H, Kao B, Cheung D W, et al. Efficient Algorithms for Incremental Update of Frequent Sequences // Proc of the 6th Pacific-Asia Conference on Knowledge Discovery and Data Mining. … China, 2002: 186-19…
[62] Parthasarathy S, Zaki M J, Ogihara M, et al. Incremental and Interactive Sequence Mining // Proc of the 8th International Conference on Information and Knowledge Management. Kansas City, USA, 1999: 251-258
[63] Masseglia F, Poncelet P, Teisseire M. Incremental Mining of Sequential Patterns in Large Databases. Data & Knowledge Engineering, 2003, 46(1): 97-121
[64] Zheng Q G, Xu K, Ma S L, et al. The Algorithms of Updating Sequential Patterns [EB/OL]. [2013-05-20]. http://arxiv.org/ftp/cs/papers/0203/0203027.pdf
[65] Wang C Y, Hong T P, Tseng S S. Maintenance of Sequential Patterns for Record Deletion // Proc of the IEEE International Conference on Data Mining. San Jose, USA, 2001: 536-541
[66] Zheng Q G, Xu K, Ma S L. When to Update the Sequential Patterns of Stream Data // Proc of the 7th Pacific-Asia Conference on Knowledge Discovery and Data Mining. Seoul, Republic of Korea, 2003: 545-550
[67] Upadhyaya S R. Parallel Approaches to Machine Learning: A Comprehensive Survey. Journal of Parallel and Distributed Computing, 2013, 73(3): 284-292
[68] Luo D J, Ding C, Huang H. Parallelization with Multiplicative Algorithms for Big Data Mining // Proc of the 12th IEEE International Conference on Data Mining. Brussels, Belgium, 2012: 489-498
[69] Shim K. MapReduce Algorithms for Big Data Analysis. Proc of the VLDB Endowment, 2012, 5(12): 2016-2017
[70] Zhang J B, Li T R, Pan Y. Parallel Rough Set Based Knowledge Acquisition Using MapReduce from Big Data // Proc of the 1st International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications. Beijing, China, 2012: 20-27
[71] Hefeeda M, Gao F, Abd-Almageed W. Distributed Approximate Spectral Clustering for Large-Scale Datasets // Proc of the 21st International ACM Symposium on High-Performance Parallel and Distributed Computing. Delft, the Netherlands, 2012: 223-234
[72] Palit I, Reddy C K. Scalable and Parallel Boosting with MapReduce. IEEE Trans on Knowledge and Data Engineering, 2012, 24(10): 1904-1916
[73] Kaiser C, Pozdnoukhov A. Enabling Real-Time City Sensing with Kernel Stream Oracles and MapReduce. Pervasive and Mobile Computing, 2013, 9(5): 708-721
[74] Yan F, Xu N Y, Qi Y. Parallel Inference for Latent Dirichlet Allocation on Graphics Processing Units // Proc of the 22nd Annual Conference on Neural Information Processing Systems. Whistler, Canada, 2009: 2134-2142
[75] Jung G, Gnanasambandam N, Mukherjee T. Synchronous Parallel Processing of Big-Data Analytics Services to Optimize Performance in Federated Clouds // Proc of the 5th IEEE International Conference on Cloud Computing. Hawaii, USA, 2012: 811-818
[76] He Q, Zhuang F Z, Li J C, et al. Parallel Implementation of Classification Algorithms Based on MapReduce // Proc of the 5th International Conference on Rough Set and Knowledge Technology. Beijing, China, 2010: 655-662
[77] He Q, Tan Q, Ma X D, et al. The High-Activity Parallel Implementation of Data Preprocessing Based on MapReduce // Proc of the 5th International Conference on Rough Set and Knowledge Technology. Beijing, China, 2010: 646-654
[78] He Q, Wang Q, Du C Y, et al. A Parallel Hyper-Surface Classifier for High Dimensional Data // Proc of the 3rd International Symposium on Knowledge Acquisition and Modeling. Wuhan, China, 2010: 338-343
[79] He Q, Ma Y L, Wang Q, et al. Parallel Outlier Detection Using KD-Tree Based on MapReduce // Proc of the 3rd International Conference on Cloud Computing Technology and Science. Athens …
[80] He Q, Wang Q, Zhuang F Z, et al. Parallel CLARANS Clustering Based on MapReduce. Energy Procedia, 2011, 13: 3269-3279
[81] Tan Q, He Q, Shi Z Z. Parallel Max-Min Ant System Using MapReduce // Proc of the 3rd International Conference on Swarm Intelligence. Shenzhen, China, 2012: 182-189
[82] Cheng Y, Qin C J, Rusu F. GLADE: Big Data Analytics Made Easy // Proc of the ACM SIGMOD International Conference on Management of Data. Scottsdale, USA, 2012: 697-700