Volume 10, Issue 4
  • ISSN: 2213-2759
  • E-ISSN: 1874-4796

Abstract

Background: Online social networks (OSNs) are now so large that collecting complete data from them is difficult, and the sheer number of nodes and links makes network analysis time-consuming. Sampling a large-scale OSN in a way that preserves the topological properties of the original network is therefore an open problem. The purpose of this paper is to study an unbiased sampling method that can extract a representative sample from the social graph. Methods: We propose an improved algorithm based on the Metropolis-Hastings random walk (MHRW), called Unbiased Delay sampling (the UD algorithm), and evaluate it against sampling methods described in recent patents. Results: Different sampling methods extract subnetworks with different topological properties. We find that UD adapts to networks with different kinds of connectivity. When repeated nodes are excluded from the sample, UD yields a better degree distribution; in addition, UD lowers the probability that an already-sampled node is selected again, which improves network discovery. Conclusion: To the best of our knowledge, this is the first unbiased sampling method that achieves a good degree distribution when the sample set contains no duplicate nodes. More specifically, we add a parameter α to the sampling process, and the value of α controls the repetition rate of the sample set.
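
For orientation, the baseline that UD builds on can be sketched as follows. This is a minimal illustration of a standard Metropolis-Hastings random walk on an undirected graph, not the UD algorithm itself: the abstract does not specify how the delay or the parameter α enters the walk, so neither is modeled here. The function name `mhrw_sample`, the adjacency-dictionary input, and the toy graph are assumptions made for the example.

```python
import random


def mhrw_sample(adj, start, num_steps, seed=None):
    """Baseline MHRW over an undirected graph (illustrative sketch only).

    adj: dict mapping each node to a non-empty list of its neighbors
         (hypothetical input format chosen for this example).
    Returns the sequence of visited nodes, including repeats.
    """
    rng = random.Random(seed)
    current = start
    walk = [current]
    for _ in range(num_steps):
        neighbors = adj[current]
        candidate = rng.choice(neighbors)
        # Accept the move with probability min(1, deg(current) / deg(candidate)).
        # This corrects the plain random walk's bias toward high-degree nodes,
        # so the stationary distribution is uniform over nodes.
        if rng.random() < len(neighbors) / len(adj[candidate]):
            current = candidate
        # On rejection the walk stays put, which is where repeated nodes arise.
        walk.append(current)
    return walk


# Toy graph: a hub "a" connected to three leaves, plus one leaf-leaf edge.
toy_graph = {
    "a": ["b", "c", "d"],
    "b": ["a", "c"],
    "c": ["a", "b"],
    "d": ["a"],
}
sample = mhrw_sample(toy_graph, "a", 200, seed=1)
```

The self-loops on rejected moves are what produce duplicate nodes in an MHRW sample; the abstract's α parameter is described as controlling that repetition rate.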

2017-11-01
2025-09-23

  • Article Type:
    Research Article
Keyword(s): degree distribution; independent sample; MHRW; social network; Twitter; unbiased sampling