High Performance Data Mining: Scaling Algorithms, Applications and Systems, by Yike Guo, R.L. Grossman


High Performance Data Mining: Scaling Algorithms, Applications and Systems brings together in one place important contributions and up-to-date research results in this fast-moving area.
High Performance Data Mining: Scaling Algorithms, Applications and Systems serves as an excellent reference, providing insight into some of the most challenging research issues in the field.


Best organization and data processing books

Languages and Compilers for Parallel Computing: 10th International Workshop, LCPC'97 Minneapolis, Minnesota, USA, August 7–9, 1997 Proceedings

This book constitutes the thoroughly refereed post-workshop proceedings of the 10th International Workshop on Languages and Compilers for Parallel Computing, LCPC'97, held in Minneapolis, Minnesota, USA in August 1997. The book presents 28 revised full papers together with 4 posters; all papers were carefully selected for presentation at the workshop and went through a thorough reviewing and revision phase afterwards.

Cloud Computing: Web-basierte dynamische IT-Services (Informatik im Fokus) (German Edition)

As an Internet service, cloud computing allows IT infrastructure, platforms and applications to be provisioned and used. The amount of resources currently needed is always what is made available and billed. In the book, the authors give an overview of cloud computing architecture, its applications and its development.

Data Management in a Connected World: Essays Dedicated to Hartmut Wedekind on the Occasion of His 70th Birthday

Data management systems play the most crucial role in building large application systems. Since modern applications are no longer single monolithic software blocks but highly flexible and configurable collections of cooperative services, the data management layer also has to adapt to these new requirements.

Additional info for High Performance Data Mining: Scaling Algorithms, Applications and Systems

Sample text

[...] i.e. on databases of significantly more than just a few thousand objects. Ester et al. (1996) present the density-based clustering algorithm DBSCAN. [...] the neighborhood of a given radius (Eps > 0) has to contain at least a minimum number of points (MinPts > 0). DBSCAN meets the above requirements in the following sense: first, DBSCAN requires only two input parameters (Eps, MinPts) and supports the user in determining appropriate values for them. Second, it discovers clusters of arbitrary shape and can distinguish noise. Third, using spatial access methods, DBSCAN is efficient even for very large spatial databases.
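To make the two parameters concrete, here is a minimal DBSCAN sketch in Python. It is an illustration under stated assumptions, not the authors' implementation: the names `region_query` and `dbscan` are chosen here, and a brute-force distance scan stands in for the spatial access methods the text mentions, so it does not show the efficiency claimed for large databases.

```python
from math import dist

NOISE = -1  # label for points that belong to no cluster


def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (i itself included).

    Assumption: a linear scan replaces a spatial index lookup.
    """
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE marks noise points."""
    labels = [None] * len(points)
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may be relabeled later as a border point
            continue
        # points[i] is a core point: start a new cluster and expand it
        labels[i] = cluster_id
        seeds = list(neighbors)
        while seeds:
            j = seeds.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                seeds.extend(j_neighbors)  # j is also a core point
        cluster_id += 1
    return labels
```

For example, with two well-separated 2×2 squares of points and one isolated point, `dbscan(points, eps=1.5, min_pts=3)` assigns one label per square and marks the isolated point as noise.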

[A point p is directly density-reachable from a point q w.r.t.] the space constraint S, Eps and MinPts if 1. q ∈ S, 2. p ∈ N_Eps(q), and 3. Card(N_Eps(q)) ≥ MinPts (core point condition). [...] w.r.t. the space constraint S is equivalent to being directly density-reachable. [...] w.r.t. the space constraint T, Eps and MinPts [...] Obviously, this direct density-reachability is symmetric for pairs of core points. In general, however, it is not symmetric if one core point and one border point are involved. Figure 8 illustrates the definition and also shows the asymmetric case.
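The three conditions above can be checked directly. The following sketch is only an illustration: the function names are hypothetical, `space` is an assumed stand-in for the constraint S (with `None` meaning no constraint), and the neighborhood is computed by a linear scan over the database.

```python
from math import dist


def eps_neighborhood(q, points, eps):
    """All database points within distance eps of q."""
    return [x for x in points if dist(q, x) <= eps]


def directly_density_reachable(p, q, points, eps, min_pts, space=None):
    """True if p is directly density-reachable from q.

    space: optional collection of points playing the role of S;
           None means the unconstrained definition is used.
    """
    if space is not None and q not in space:
        return False                         # condition 1: q must lie in S
    neighborhood = eps_neighborhood(q, points, eps)
    return (p in neighborhood                # condition 2: p is in N_Eps(q)
            and len(neighborhood) >= min_pts)  # condition 3: q is a core point


points = [(0, 0), (1, 0), (0, 1), (2, 0)]
core, border = (1, 0), (2, 0)   # (1, 0) has 3 neighbors, (2, 0) only 2

# Asymmetric case: the border point is reachable from the core point only.
forward = directly_density_reachable(border, core, points, eps=1.0, min_pts=3)
backward = directly_density_reachable(core, border, points, eps=1.0, min_pts=3)
```

Here `forward` is true and `backward` is false, because `(2, 0)` fails the core point condition. Passing `space={(0, 0)}` makes the forward check fail as well, since `(1, 0)` is then outside S.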

We use the 'shared-nothing' architecture, which has the main advantage that it can be scaled up to hundreds and probably thousands of computers. As a data structure, we introduce the dR*-tree, a distributed spatial index structure. The main program of PDBSCAN, the master, starts a clustering slave on each available computer in the network and distributes the whole data set onto the slaves. Every slave clusters only its local data. The replicated index provides efficient access to the data, and the interference between computers is also minimized through the local access of the data.
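The master/slave division of work can be sketched roughly as follows. This is a simplified illustration, not PDBSCAN itself: worker threads stand in for networked computers, the partitioning rule (sorting the points and cutting contiguous chunks) is an assumption made here rather than the dR*-tree based distribution, `cluster_local` is a hypothetical callable supplied by the caller, and the excerpt does not describe how local results are combined, so that step is left out.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(points, n_slaves):
    """Split the data set into n_slaves contiguous chunks.

    Assumed strategy: sort the point tuples (first coordinate, then the
    next) and cut the sorted list into chunks of nearly equal size.
    """
    ordered = sorted(points)
    size, extra = divmod(len(ordered), n_slaves)
    chunks, start = [], 0
    for k in range(n_slaves):
        end = start + size + (1 if k < extra else 0)
        chunks.append(ordered[start:end])
        start = end
    return chunks


def master(points, n_slaves, cluster_local):
    """Hand each slave its own chunk and collect the per-slave results.

    Each call to cluster_local sees only its local chunk, mirroring the
    rule that every slave clusters only its local data.
    """
    chunks = partition(points, n_slaves)
    with ThreadPoolExecutor(max_workers=n_slaves) as pool:
        return list(pool.map(cluster_local, chunks))
```

Calling `master(points, 2, len)` with five points, for instance, simply reports how many points each of the two slaves received; a real local clustering function would take the place of `len`.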
