Spidering Facebook public profiles with C++ and Boost

May 17, 2013 - By Stefano Tommesani

Back in 2010, security researcher Ron Bowes wrote a Ruby script that downloaded information from Facebook’s user directory, a searchable index of public profile pages. The directory did not expose a user’s entire profile, only the information that the user had allowed Facebook to make public. Bowes got the idea of spidering the data so that he could collect statistics about the most common names.

Now, how hard can it be to write such software in C++ instead of Ruby? As Jeremy Clarkson was not interested in answering the question, I decided to write a quick and simple spider in C++, and found out that while it is easy to build, these days it is not useful at all.

The main part of this project is parsing the HTML pages that make up the Facebook directory. A directory page contains both links to other directory pages and links to users’ public profiles. Luckily, they are easy to tell apart using regular expressions, and the Boost C++ library supports Perl-style regular expressions. Given the HTML source of a page, the following functions use regular expressions to extract the useful links:

void FBSiteScanner::SearchForProfiles(string PageSourceCode)
{
    boost::regex reg("<li class=\"fbDirectoryBoxColumnItem\"><a href=\"http://([^\">]+)\"> ([^\">]+)</a></li>");
    boost::smatch m;
    std::string::const_iterator it = PageSourceCode.begin();
    std::string::const_iterator end = PageSourceCode.end();
    while (boost::regex_search(it,end,m,reg))
    {
        string UserURL = m[1].str();
        string UserName = m[2].str();
        if (UserURL.find("www.facebook.com/directory/people/") == UserURL.npos)  //< not a directory link
        {
            cout << "Adding user " << UserName << endl;
            FBProfile *NewUser = new FBProfile(UserName, UserURL);
            ProfileSerializer->Serialize(*NewUser);
            delete NewUser;
        }
        it=m[0].second;
    }
}

void FBSiteScanner::SearchForIndexPages(string PageSourceCode)
{
    boost::regex reg("<li class=\"fbDirectoryBoxColumnItem\"><a href=\"http://www.facebook.com/directory/people/([^\">]+)\">([^<]+)</a></li>");
    boost::smatch m;
    std::string::const_iterator it = PageSourceCode.begin();
    std::string::const_iterator end = PageSourceCode.end();
    while (boost::regex_search(it,end,m,reg))
    {
        stringstream ss;
        ss << "/directory/people/";
        string URLCompletion = string(m[1]);
        string URLName = string(m[2]);
        ss << m[1];
        string PeopleURL;
        ss >> PeopleURL;
        cout << "Adding URL " << PeopleURL << endl;
        FBPageLink *NewPageLink = new FBPageLink(PeopleURL, PageSourceCode);
        PageLinkStack->Push(NewPageLink);
        it=m[0].second;
    }
}

void FBSiteScanner::ProcessPageSource(string PageSourceCode)
{
    if (PageSourceCode == "")
        return;
    SearchForIndexPages(PageSourceCode);
    SearchForProfiles(PageSourceCode);
}
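For readers who want to try the matching logic without installing Boost, here is a minimal sketch that swaps in std::regex from the C++11 standard library in place of boost::regex. It is not the code used in this project: the helper names ExtractLinks and IsDirectoryLink are hypothetical, and it uses a single pattern and then classifies each link by its URL, whereas the functions above use two separate patterns.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns (url, text) pairs for every directory list item.
std::vector<std::pair<std::string, std::string>> ExtractLinks(const std::string& html)
{
    // Same idea as the article's patterns: capture the host and path, then the link text.
    static const std::regex item(
        "<li class=\"fbDirectoryBoxColumnItem\"><a href=\"http://([^\">]+)\">\\s*([^<]+)</a></li>");
    std::vector<std::pair<std::string, std::string>> links;
    for (std::sregex_iterator it(html.begin(), html.end(), item), end; it != end; ++it)
        links.emplace_back((*it)[1].str(), (*it)[2].str());
    return links;
}

// A link points to another directory page if its URL contains the directory prefix.
bool IsDirectoryLink(const std::string& url)
{
    return url.find("www.facebook.com/directory/people/") != std::string::npos;
}
```

Links for which IsDirectoryLink returns true would be pushed onto the stack of pages to visit, and all the others would be treated as user profiles.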

The links to be parsed are held in a stack named PageLinkStack, as the tree traversal algorithm is based on depth-first search to minimize memory usage. Each spidering thread pops a link to parse from the stack, downloads the HTML page, searches for links to other directory pages and pushes them onto the stack, then searches for user profiles and stores them in the output file as XML.

void ScannerThreadFunction()
{
    FBSiteScanner *SiteScanner = new FBSiteScanner(PageLinkStack, ProfileSerializer);

    while (true)
    {
        FBPageLink *LinkToScan = PageLinkStack->Pop();
        if (LinkToScan == nullptr)
            break;  // no links left to scan
        string PageSource = LinkToScan->GetPageSource();
        SiteScanner->ProcessPageSource(PageSource);
        delete LinkToScan;
    }

    delete SiteScanner;
    cout << "closing scanner thread" << endl;
}
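The pop, download, push cycle can be tried out offline with the sketch below. It is only an illustration under several assumptions: LinkStack is a hypothetical stand-in for FBPageLinkStack that stores URLs by value and uses std::mutex with std::lock_guard, the "site" is an in-memory map rather than real HTTP downloads, and the directory is assumed to be a tree, so no visited set is kept.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <stack>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for FBPageLinkStack that stores URLs by value.
class LinkStack
{
public:
    void Push(std::string url)
    {
        std::lock_guard<std::mutex> lock(access_); // unlocked automatically on return
        links_.push(std::move(url));
    }

    // Moves the top URL into 'url'; returns false when the stack is empty.
    bool Pop(std::string& url)
    {
        std::lock_guard<std::mutex> lock(access_);
        if (links_.empty())
            return false;
        url = std::move(links_.top());
        links_.pop();
        return true;
    }

private:
    std::stack<std::string> links_;
    std::mutex access_;
};

// Hypothetical in-memory "site": each directory page lists child pages and profiles.
struct Page
{
    std::vector<std::string> childPages;
    std::vector<std::string> profiles;
};

// The scanning loop: pop a link, "download" it, push child pages, collect profiles.
std::vector<std::string> Crawl(const std::map<std::string, Page>& site, LinkStack& pending)
{
    std::vector<std::string> found;
    std::string url;
    while (pending.Pop(url))
    {
        const auto it = site.find(url);
        if (it == site.end())
            continue; // unknown page: nothing to parse
        for (const std::string& child : it->second.childPages)
            pending.Push(child);
        found.insert(found.end(), it->second.profiles.begin(), it->second.profiles.end());
    }
    return found;
}
```

Because the most recently pushed page is visited first, the stack only ever holds the unvisited siblings along the current path. One caveat that applies to the loop above as well: a worker that finds the stack empty exits, even if another thread is about to push more links.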

Having packed the spidering loop inside the ScannerThreadFunction, it is trivial to use Boost again to create a group of worker threads, let them loose on the Facebook directory and wait for them to finish their job.

    const int MAX_SCANNING_THREADS = 8;
    boost::thread_group group;
    for (int i = 0; i < MAX_SCANNING_THREADS; ++i)
        group.create_thread(ScannerThreadFunction);
    group.join_all();  //< wait for all scanning threads to complete
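The same create-then-join pattern can also be sketched with std::thread from the C++11 standard library, for readers who prefer not to link Boost.Thread. This is an alternative to boost::thread_group, not what the project uses; RunWorkers and CountProcessedItems are hypothetical names, and the shared atomic counter merely stands in for the stack of links.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: runs 'worker' on 'threadCount' threads and waits for all of them.
template <typename Worker>
void RunWorkers(int threadCount, Worker worker)
{
    std::vector<std::thread> group;
    for (int i = 0; i < threadCount; ++i)
        group.emplace_back(worker);
    for (std::thread& t : group)
        t.join(); // same role as thread_group::join_all()
}

// Demo: eight workers claim items from a shared counter until none are left.
int CountProcessedItems(int itemCount)
{
    std::atomic<int> remaining(itemCount);
    std::atomic<int> processed(0);
    RunWorkers(8, [&] {
        // fetch_sub returns the previous value, so a positive result means an item was claimed.
        while (remaining.fetch_sub(1) > 0)
            ++processed;
    });
    return processed.load();
}
```

On some toolchains, code using std::thread must be built with the platform's threading option (for example -pthread with GCC).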

The final result of spidering is a list of Facebook user names with links to their profile pages, serialized as an XML file so that it can be easily processed by other apps. Serializing a C++ object is easy, once again thanks to the Boost library:

#include <string>
#include <boost/serialization/nvp.hpp> // "name-value pair"

using namespace std;
using namespace boost::archive;       // namespace for archives
using namespace boost::serialization; // namespace for make_nvp

class FBProfile
{
public:
    FBProfile(const string& _UserName, const string& _UserURL);
    ~FBProfile(void);
private:
    friend class boost::serialization::access;
    string UserName;
    string UserURL;
    template<typename Archive>void serialize(Archive& ar, const unsigned int version)
    {
        using boost::serialization::make_nvp;
        ar & make_nvp("Name", UserName);
        ar & make_nvp("URL", UserURL);
    };
};

The FBProfile class defines which data elements are saved in the XML serialization (UserName and UserURL), while the FBProfileSerializer class opens the output file stream, creates an XML output archive on top of it and serializes incoming FBProfile objects into that archive, using a mutex so that only one spidering thread can write to the output file at a time.

#include <fstream>

FBProfileSerializer::FBProfileSerializer(string FileName)
{
    ProfileStream = new ofstream(FileName);
    XMLProfileStream = new xml_oarchive(*ProfileStream);
}

FBProfileSerializer::~FBProfileSerializer(void)
{
    ProfileStream->close();
    delete XMLProfileStream;
    delete ProfileStream;
}

void FBProfileSerializer::Serialize(const FBProfile& Profile)
{
    SerializerAccess.lock();
    try
    {
        *XMLProfileStream << make_nvp("Profile", Profile);
    } catch (...) {
    }
    SerializerAccess.unlock();
}
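The "one writer at a time" idea can be illustrated without Boost by the sketch below. It is a simplified, hand-rolled writer and does not reproduce the output format of Boost.Serialization's xml_oarchive; ProfileWriter and its element layout are assumptions made for the example. It also shows a scoped lock (std::lock_guard) as an alternative to the explicit lock/unlock pair with try/catch used above.

```cpp
#include <cassert>
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical simplified serializer: writes one <Profile> element per call.
class ProfileWriter
{
public:
    explicit ProfileWriter(std::ostream& out) : out_(out) {}

    void Write(const std::string& name, const std::string& url)
    {
        std::lock_guard<std::mutex> lock(access_); // released on every exit path
        out_ << "<Profile><Name>" << Escape(name) << "</Name><URL>"
             << Escape(url) << "</URL></Profile>\n";
    }

private:
    // Escapes the characters that are special in XML text content.
    static std::string Escape(const std::string& text)
    {
        std::string escaped;
        for (char c : text)
        {
            switch (c)
            {
            case '&': escaped += "&amp;"; break;
            case '<': escaped += "&lt;"; break;
            case '>': escaped += "&gt;"; break;
            default:  escaped += c; break;
            }
        }
        return escaped;
    }

    std::ostream& out_;
    std::mutex access_;
};
```

With a scoped lock the mutex is released even if the stream throws, so the exception does not have to be swallowed just to guarantee the unlock.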

Summing up, thanks to the Boost C++ library, writing a spider in a low-level language (or so they say) such as C++ proved to be an easy task, even when adding features that were missing in the original Ruby code, such as multithreaded spidering and XML output. And… does it work? Well, sort of: Facebook is not really happy that you’re spidering their web pages to harvest users’ profiles, so after a few thousand profiles (and after a while, on the very first request) they will send out a captcha to block robots.

What follows is the complete source code of this project. Please note that it is not meant to be production-quality code, but just a quick try at building a spider.

FBScanner.cpp

// FBScanner
// by Stefano Tommesani (www.tommesani.com) 2013
// this code is released under the Code Project Open License (CPOL) http://www.codeproject.com/info/cpol10.aspx
// The main points subject to the terms of the License are:
// -   Source Code and Executable Files can be used in commercial applications;
// -   Source Code and Executable Files can be redistributed; and
// -   Source Code can be modified to create derivative works.
// -   No claim of suitability, guarantee, or any warranty whatsoever is provided. The software is provided "as-is".
// -   The Article(s) accompanying the Work may not be distributed or republished without the Author's consent

// FBScanner.cpp : defines the entry point for the console application.
//

#include "stdafx.h"

#include <iostream>
#include <sstream>
#include <string>

#include <boost/thread.hpp>

#include "FBSiteScanner.h"

FBPageLinkStack *PageLinkStack = NULL;
FBProfileSerializer *ProfileSerializer = NULL;

void ScannerThreadFunction()
{
    FBSiteScanner *SiteScanner = new FBSiteScanner(PageLinkStack, ProfileSerializer);

    while (true)
    {
        FBPageLink *LinkToScan = PageLinkStack->Pop();
        if (LinkToScan == nullptr)
            break;  // no links left to scan
        string PageSource = LinkToScan->GetPageSource();
        SiteScanner->ProcessPageSource(PageSource);
        delete LinkToScan;
    }

    delete SiteScanner;
    cout << "closing scanner thread" << endl;
}

void AddStartingPaths()
{
    cout << "add starting paths" << endl;

    for (int NumChars = 26; NumChars >= 1; NumChars--)
    {
        stringstream ss;
        ss << "/directory/people/";
        ss << NumChars;
        string NewLinkURL;
        ss >> NewLinkURL;
        FBPageLink *NewLink = new FBPageLink(NewLinkURL, "/directory/");
        PageLinkStack->Push(NewLink);
    }

    for (char AlphaChars = 'Z'; AlphaChars >= 'A'; AlphaChars--)
    {
        stringstream ss;
        ss << "/directory/people/";
        ss << AlphaChars;
        string NewLinkURL;
        ss >> NewLinkURL;
        FBPageLink *NewLink = new FBPageLink(NewLinkURL, "/directory/");
        PageLinkStack->Push(NewLink);
    }
}

int _tmain(int argc, _TCHAR* argv[])
{
    cout << "FBScanner by Stefano Tommesani 2013" << endl;
    PageLinkStack = new FBPageLinkStack();
    ProfileSerializer = new FBProfileSerializer("profiles.xml");
    AddStartingPaths();

    const int MAX_SCANNING_THREADS = 8;
    boost::thread_group group;
    for (int i = 0; i < MAX_SCANNING_THREADS; ++i)
        group.create_thread(ScannerThreadFunction);
    group.join_all();  //< wait for all scanning threads to complete

    cout << "search completed, shutting down" << endl;
    delete PageLinkStack;
    delete ProfileSerializer;
    return 0;
}

FBSiteScanner.h

// FBScanner
// by Stefano Tommesani (www.tommesani.com) 2013
// this code is released under the Code Project Open License (CPOL) http://www.codeproject.com/info/cpol10.aspx
// The main points subject to the terms of the License are:
// -   Source Code and Executable Files can be used in commercial applications;
// -   Source Code and Executable Files can be redistributed; and
// -   Source Code can be modified to create derivative works.
// -   No claim of suitability, guarantee, or any warranty whatsoever is provided. The software is provided "as-is".
// -   The Article(s) accompanying the Work may not be distributed or republished without the Author's consent

#pragma once

#include "FBPageLink.h"
#include "FBProfile.h"
#include "FBProfileSerializer.h"
#include "FBPageLinkStack.h"

class FBSiteScanner
{
public:
    FBSiteScanner(FBPageLinkStack *_PageLinkStack, FBProfileSerializer *_ProfileSerializer);
    ~FBSiteScanner(void);
    void ProcessPageSource(string PageSourceCode);
private:
    FBPageLinkStack *PageLinkStack;
    FBProfileSerializer *ProfileSerializer;
    void SearchForProfiles(string PageSourceCode);
    void SearchForIndexPages(string PageSourceCode);
};

FBSiteScanner.cpp

// FBScanner
// by Stefano Tommesani (www.tommesani.com) 2013
// this code is released under the Code Project Open License (CPOL) http://www.codeproject.com/info/cpol10.aspx
// The main points subject to the terms of the License are:
// -   Source Code and Executable Files can be used in commercial applications;
// -   Source Code and Executable Files can be redistributed; and
// -   Source Code can be modified to create derivative works.
// -   No claim of suitability, guarantee, or any warranty whatsoever is provided. The software is provided "as-is".
// -   The Article(s) accompanying the Work may not be distributed or republished without the Author's consent

#include "FBSiteScanner.h"
#include "FBProfile.h"
#include "FBProfileSerializer.h"

#include <iostream>
#include <string>
#include <sstream>
#include "boost/regex.hpp"

FBSiteScanner::FBSiteScanner(FBPageLinkStack *_PageLinkStack,  FBProfileSerializer *_ProfileSerializer)
{
    PageLinkStack = _PageLinkStack;
    _ASSERT(PageLinkStack != nullptr);
    ProfileSerializer = _ProfileSerializer;
    _ASSERT(ProfileSerializer != nullptr);
}

FBSiteScanner::~FBSiteScanner(void)
{
}

void FBSiteScanner::SearchForProfiles(string PageSourceCode)
{
    boost::regex reg("<li class=\"fbDirectoryBoxColumnItem\"><a href=\"http://([^\">]+)\"> ([^\">]+)</a></li>");
    boost::smatch m;
    std::string::const_iterator it = PageSourceCode.begin();
    std::string::const_iterator end = PageSourceCode.end();
    while (boost::regex_search(it,end,m,reg))
    {
        string UserURL = m[1].str();
        string UserName = m[2].str();
        if (UserURL.find("www.facebook.com/directory/people/") == UserURL.npos)  //< not a directory link
        {
            cout << "Adding user " << UserName << endl;
            FBProfile *NewUser = new FBProfile(UserName, UserURL);
            ProfileSerializer->Serialize(*NewUser);
            delete NewUser;
        }
        it=m[0].second;
    }
}

void FBSiteScanner::SearchForIndexPages(string PageSourceCode)
{
    boost::regex reg("<li class=\"fbDirectoryBoxColumnItem\"><a href=\"http://www.facebook.com/directory/people/([^\">]+)\">([^<]+)</a></li>");
    boost::smatch m;
    std::string::const_iterator it = PageSourceCode.begin();
    std::string::const_iterator end = PageSourceCode.end();
    while (boost::regex_search(it,end,m,reg))
    {
        stringstream ss;
        ss << "/directory/people/";
        string URLCompletion = string(m[1]);
        string URLName = string(m[2]);
        ss << m[1];
        string PeopleURL;
        ss >> PeopleURL;
        cout << "Adding URL " << PeopleURL << endl;
        FBPageLink *NewPageLink = new FBPageLink(PeopleURL, PageSourceCode);
        PageLinkStack->Push(NewPageLink);
        it=m[0].second;
    }
}

void FBSiteScanner::ProcessPageSource(string PageSourceCode)
{
    if (PageSourceCode == "")
        return;
    SearchForIndexPages(PageSourceCode);
    SearchForProfiles(PageSourceCode);
}

FBPageLink.h

// FBScanner
// by Stefano Tommesani (www.tommesani.com) 2013
// this code is released under the Code Project Open License (CPOL) http://www.codeproject.com/info/cpol10.aspx
// The main points subject to the terms of the License are:
// -   Source Code and Executable Files can be used in commercial applications;
// -   Source Code and Executable Files can be redistributed; and
// -   Source Code can be modified to create derivative works.
// -   No claim of suitability, guarantee, or any warranty whatsoever is provided. The software is provided "as-is".
// -   The Article(s) accompanying the Work may not be distributed or republished without the Author's consent

#pragma once

#include <string>
using namespace std;

class FBPageLink
{
public:
    FBPageLink(string _PageURL, string _ParentPageURL);
    ~FBPageLink(void);
    string GetURL() const;
    string GetParentURL() const;
    string GetPageSource();
protected:
    string PageURL;
    string ParentPageURL;
};

FBPageLink.cpp

// FBScanner
// by Stefano Tommesani (www.tommesani.com) 2013
// this code is released under the Code Project Open License (CPOL) http://www.codeproject.com/info/cpol10.aspx
// The main points subject to the terms of the License are:
// -   Source Code and Executable Files can be used in commercial applications;
// -   Source Code and Executable Files can be redistributed; and
// -   Source Code can be modified to create derivative works.
// -   No claim of suitability, guarantee, or any warranty whatsoever is provided. The software is provided "as-is".
// -   The Article(s) accompanying the Work may not be distributed or republished without the Author's consent

#include "FBPageLink.h"

FBPageLink::FBPageLink(string _PageURL, string _ParentPageURL)
{
    PageURL = _PageURL;
    ParentPageURL = _ParentPageURL;
}

FBPageLink::~FBPageLink(void)
{
}

string FBPageLink::GetURL() const
{
    return PageURL;
}

string FBPageLink::GetParentURL() const
{
    return ParentPageURL;
}

#include <windows.h>

#include <stdio.h>
#include <stdlib.h>

//#define USE_WINHTTP

#ifdef USE_WINHTTP
#include <winhttp.h>
#pragma comment ( lib, "winhttp.lib" )

BOOL CheckStatus(HINTERNET &hRequest, DWORD &dwHTTPStatusCode)
{
    // 1. check status code
    WCHAR* lpOutStatusCode      = NULL;
    DWORD  dwSizeStatusCode     = 0;
    WinHttpQueryHeaders(hRequest,
                        WINHTTP_QUERY_STATUS_CODE,  //WINHTTP_QUERY_STATUS_TEXT or WINHTTP_QUERY_RAW_HEADERS_CRLF
                        WINHTTP_HEADER_NAME_BY_INDEX,
                        NULL,
                        &dwSizeStatusCode,
                        WINHTTP_NO_HEADER_INDEX);

    // Allocate memory for the buffer.
    if(GetLastError() == ERROR_INSUFFICIENT_BUFFER)
    {
        lpOutStatusCode = new WCHAR[dwSizeStatusCode/sizeof(WCHAR)];

        // Now, use WinHttpQueryHeaders to retrieve the header.
        if(!WinHttpQueryHeaders(hRequest,
                                WINHTTP_QUERY_STATUS_CODE,
                                WINHTTP_HEADER_NAME_BY_INDEX,
                                lpOutStatusCode,
                                &dwSizeStatusCode,
                                WINHTTP_NO_HEADER_INDEX))
        {
            delete [] lpOutStatusCode;
            return FALSE;
        }

        dwHTTPStatusCode = _wtoi(lpOutStatusCode);

        delete [] lpOutStatusCode;

        if(HTTP_STATUS_OK == dwHTTPStatusCode)
        {
            return TRUE;
        }
    }

    return FALSE;
}

string DownloadWebPage(LPCTSTR lpsz_server_ip, LPCTSTR lpsz_src_file_path)
{
    string result_buffer = "";
    DWORD dwReturnHTTPStatusCode;
    DWORD dwSize        = 0;  // declared up front so that no goto skips its initialization
    errno_t err         = 0;

    int iResolveTimeout = 5000;
    int iConnectTimeout = 5000;
    int iSendTimeout    = 60000;
    int iReceiveTimeout = 60000;

    HINTERNET  hSession = NULL;
    HINTERNET  hConnect = NULL;
    HINTERNET  hRequest = NULL;

    // 1. create session
    hSession = WinHttpOpen(L"WinHTTP download/1.0", WINHTTP_ACCESS_TYPE_NO_PROXY, WINHTTP_NO_PROXY_NAME, WINHTTP_NO_PROXY_BYPASS, 0);
    if(!hSession)
    {
        goto __Exit__;
    }

    // 2. set timeout
    if(!WinHttpSetTimeouts(hSession, iResolveTimeout, iConnectTimeout, iSendTimeout, iReceiveTimeout))
    {
        goto __Exit__;
    }

    // 3. connect
    hConnect = WinHttpConnect( hSession, lpsz_server_ip, INTERNET_DEFAULT_HTTP_PORT, 0);
    if(!hConnect)
    {
        goto __Exit__;
    }

    // 4. open request
    hRequest = WinHttpOpenRequest( hConnect, L"GET", lpsz_src_file_path,NULL, WINHTTP_NO_REFERER, WINHTTP_DEFAULT_ACCEPT_TYPES, WINHTTP_FLAG_REFRESH);
    if(!hRequest)
    {
        goto __Exit__;
    }

    // 5. send request
    if(!WinHttpSendRequest( hRequest,WINHTTP_NO_ADDITIONAL_HEADERS, 0,WINHTTP_NO_REQUEST_DATA, 0, 0, 0))
    {
        goto __Exit__;
    }

    // 6. receive response
    if(!WinHttpReceiveResponse( hRequest, NULL))
    {
        goto __Exit__;
    }

    // 6.1 query header
    if(!CheckStatus(hRequest, dwReturnHTTPStatusCode))
    {
        goto __Exit__;
    }

    do
    {
        // 7. check for available data.
        dwSize = 0;
        if (!WinHttpQueryDataAvailable( hRequest, &dwSize))
        {
            goto __Exit__;
        }

        if(0 == dwSize)
        {
            break;
        }

        // Allocate space for the buffer.
        char *pszOutBuffer = new char[dwSize+1];
        if (!pszOutBuffer)
        {
            goto __Exit__;
        }
        else
        {
            // 8. read the data.
            ZeroMemory(pszOutBuffer, dwSize+1);
            DWORD dwDownloaded;
            if (!WinHttpReadData( hRequest, (LPVOID)pszOutBuffer, dwSize, &dwDownloaded))
            {
                delete [] pszOutBuffer;
                goto __Exit__;
            }
            else
            {
                pszOutBuffer[dwDownloaded] = 0 ; // insert the null terminator.
                result_buffer.append(string(pszOutBuffer));
            }

            // Free the memory allocated to the buffer.
            delete [] pszOutBuffer;
        }

    } while (dwSize>0);

__Exit__:

    if (hRequest) WinHttpCloseHandle(hRequest);
    if (hConnect) WinHttpCloseHandle(hConnect);
    if (hSession) WinHttpCloseHandle(hSession);

    return result_buffer;
}

string FBPageLink::GetPageSource()
{
    wstring UnicodePageURL;
    UnicodePageURL.assign(PageURL.begin(), PageURL.end());
    return DownloadWebPage((LPCTSTR)L"www.facebook.com", (LPCTSTR)UnicodePageURL.c_str());
}

#else

#include <wininet.h>

#pragma comment ( lib, "Wininet.lib" )

string FBPageLink::GetPageSource()
{
  HINTERNET hInternet = InternetOpenA("InetURL/1.0", INTERNET_OPEN_TYPE_PRECONFIG, NULL, NULL, 0 );
  HINTERNET hConnection = InternetConnectA( hInternet, "www.facebook.com", 80, " "," ", INTERNET_SERVICE_HTTP, 0, 0 );
  HINTERNET hData = HttpOpenRequestA( hConnection, "GET", PageURL.c_str(), NULL, NULL, NULL, INTERNET_FLAG_KEEP_CONNECTION, 0 );

  const int BUFFER_SIZE = 64 * 1024;
  string result_buffer = "";
  char buf[BUFFER_SIZE + 128];

  HttpSendRequestA( hData, NULL, 0, NULL, 0 ) ;

  DWORD bytesRead = 0 ;
  DWORD totalBytesRead = 0 ;
  // http://msdn.microsoft.com/en-us/library/aa385103(VS.85).aspx
  // To ensure all data is retrieved, an application must continue to call the
  // InternetReadFile function until the function returns TRUE and the
  // lpdwNumberOfBytesRead parameter equals zero.

  while( InternetReadFile( hData, buf, BUFFER_SIZE, &bytesRead ) && bytesRead != 0 )
  {
    buf[ bytesRead ] = 0 ; // insert the null terminator.
    result_buffer.append(string(buf));
    totalBytesRead += bytesRead ;
  }
  InternetCloseHandle( hData ) ;
  InternetCloseHandle( hConnection ) ;
  InternetCloseHandle( hInternet ) ;

  return result_buffer;
}

#endif

FBPageLinkStack.h

// FBScanner
// by Stefano Tommesani (www.tommesani.com) 2013
// this code is released under the Code Project Open License (CPOL) http://www.codeproject.com/info/cpol10.aspx
// The main points subject to the terms of the License are:
// -   Source Code and Executable Files can be used in commercial applications;
// -   Source Code and Executable Files can be redistributed; and
// -   Source Code can be modified to create derivative works.
// -   No claim of suitability, guarantee, or any warranty whatsoever is provided. The software is provided "as-is".
// -   The Article(s) accompanying the Work may not be distributed or republished without the Author's consent

#pragma once

#include <stack>
#include <boost/thread.hpp>

#include "FBPageLink.h"

class FBPageLinkStack
{
public:
    FBPageLinkStack(void);
    ~FBPageLinkStack(void);
    FBPageLink *Pop();
    void Push(FBPageLink *NewPageLink);
private:
    std::stack<FBPageLink *>PageLinks;
    boost::mutex PageLinksAccess;
};

FBPageLinkStack.cpp

// FBScanner
// by Stefano Tommesani (www.tommesani.com) 2013
// this code is released under the Code Project Open License (CPOL) http://www.codeproject.com/info/cpol10.aspx
// The main points subject to the terms of the License are:
// -   Source Code and Executable Files can be used in commercial applications;
// -   Source Code and Executable Files can be redistributed; and
// -   Source Code can be modified to create derivative works.
// -   No claim of suitability, guarantee, or any warranty whatsoever is provided. The software is provided "as-is".
// -   The Article(s) accompanying the Work may not be distributed or republished without the Author's consent

#include "FBPageLinkStack.h"

FBPageLinkStack::FBPageLinkStack(void)
{
}

FBPageLinkStack::~FBPageLinkStack(void)
{
}

FBPageLink *FBPageLinkStack::Pop()
{
    FBPageLink *ReturnedItem = nullptr;
    PageLinksAccess.lock();
    try
    {
        if (!PageLinks.empty())
        {
            ReturnedItem = PageLinks.top();
            PageLinks.pop();
        }
    } catch (...) {};
    PageLinksAccess.unlock();
    return ReturnedItem;
}

void FBPageLinkStack::Push(FBPageLink *NewPageLink)
{
    if (NewPageLink == nullptr)
        return;
    PageLinksAccess.lock();
    try
    {
        PageLinks.push(NewPageLink);
    } catch (...) {};
    PageLinksAccess.unlock();
}

FBProfile.h

// FBScanner
// by Stefano Tommesani (www.tommesani.com) 2013
// this code is released under the Code Project Open License (CPOL) http://www.codeproject.com/info/cpol10.aspx
// The main points subject to the terms of the License are:
// -   Source Code and Executable Files can be used in commercial applications;
// -   Source Code and Executable Files can be redistributed; and
// -   Source Code can be modified to create derivative works.
// -   No claim of suitability, guarantee, or any warranty whatsoever is provided. The software is provided "as-is".
// -   The Article(s) accompanying the Work may not be distributed or republished without the Author's consent

#pragma once

#include <string>
#include <boost/serialization/nvp.hpp> // "name-value pair"

using namespace std;
using namespace boost::archive;       // namespace for archives
using namespace boost::serialization; // namespace for make_nvp

class FBProfile
{
public:
    FBProfile(const string& _UserName, const string& _UserURL);
    ~FBProfile(void);
private:
    friend class boost::serialization::access;
    string UserName;
    string UserURL;
    template<typename Archive>void serialize(Archive& ar, const unsigned int version)
    {
        using boost::serialization::make_nvp;
        ar & make_nvp("Name", UserName);
        ar & make_nvp("URL", UserURL);
    };
};

FBProfile.cpp

// FBScanner
// by Stefano Tommesani (www.tommesani.com) 2013
// this code is released under the Code Project Open License (CPOL) http://www.codeproject.com/info/cpol10.aspx
// The main points subject to the terms of the License are:
// -   Source Code and Executable Files can be used in commercial applications;
// -   Source Code and Executable Files can be redistributed; and
// -   Source Code can be modified to create derivative works.
// -   No claim of suitability, guarantee, or any warranty whatsoever is provided. The software is provided "as-is".
// -   The Article(s) accompanying the Work may not be distributed or republished without the Author's consent

#include "FBProfile.h"

FBProfile::FBProfile(const string& _UserName, const string& _UserURL)
{
    UserName = _UserName;
    UserURL = _UserURL;
}

FBProfile::~FBProfile(void)
{
}

FBProfileSerializer.h

// FBScanner
// by Stefano Tommesani (www.tommesani.com) 2013
// this code is released under the Code Project Open License (CPOL) http://www.codeproject.com/info/cpol10.aspx
// The main points subject to the terms of the License are:
// -   Source Code and Executable Files can be used in commercial applications;
// -   Source Code and Executable Files can be redistributed; and
// -   Source Code can be modified to create derivative works.
// -   No claim of suitability, guarantee, or any warranty whatsoever is provided. The software is provided "as-is".
// -   The Article(s) accompanying the Work may not be distributed or republished without the Author's consent

#pragma once

#include <string>
#include <boost/archive/xml_oarchive.hpp> // Archive for writing XML
#include <boost/archive/xml_iarchive.hpp> // Archive for reading XML
#include <boost/thread.hpp>

using namespace std;
using namespace boost::archive;       // namespace for archives
using namespace boost::serialization; // namespace for make_nvp

#include "FBProfile.h"

class FBProfileSerializer
{
public:
    FBProfileSerializer(string FileName);
    ~FBProfileSerializer(void);
    void Serialize(const FBProfile& Profile);
private:
    ofstream *ProfileStream;
    xml_oarchive *XMLProfileStream;
    boost::mutex SerializerAccess;
};

FBProfileSerializer.cpp

// FBScanner
// by Stefano Tommesani (www.tommesani.com) 2013
// this code is released under the Code Project Open License (CPOL) http://www.codeproject.com/info/cpol10.aspx
// The main points subject to the terms of the License are:
// -   Source Code and Executable Files can be used in commercial applications;
// -   Source Code and Executable Files can be redistributed; and
// -   Source Code can be modified to create derivative works.
// -   No claim of suitability, guarantee, or any warranty whatsoever is provided. The software is provided "as-is".
// -   The Article(s) accompanying the Work may not be distributed or republished without the Author's consent

#include "FBProfileSerializer.h"

#include <fstream>

FBProfileSerializer::FBProfileSerializer(string FileName)
{
    ProfileStream = new ofstream(FileName);
    XMLProfileStream = new xml_oarchive(*ProfileStream);
}

FBProfileSerializer::~FBProfileSerializer(void)
{
    ProfileStream->close();
    delete XMLProfileStream;
    delete ProfileStream;
}

void FBProfileSerializer::Serialize(const FBProfile& Profile)
{
    SerializerAccess.lock();
    try
    {
        *XMLProfileStream << make_nvp("Profile", Profile);
    } catch (...) {
    }
    SerializerAccess.unlock();
}