twitter - Splitting hashtags in a data.frame object with R

I am collecting Twitter hashtags. Each tweet can include several hashtags, or none.

tests <- c("xxxxxx #savethedate xxxxxx #histoire] xxxxxx #femmes xxxxxxx #ports",
           "xxxxxxxxxxxx",
           "xxxx #rock xxxxxx #nantes",
           "xxxxxx #lvan xxxxxxx #nantes xxxxx #ilsepassetoujoursuntruc")
library(stringr)
# "#\\S+" matches a "#" followed by one or more non-whitespace characters
hashtags <- str_extract_all(tests, "#\\S+")
str(hashtags)

My results:

str(hashtags)
List of 4
 $ : chr [1:4] "#savethedate" "#histoire]" "#femmes" "#ports"
 $ : chr(0)
 $ : chr [1:2] "#rock" "#nantes"
 $ : chr [1:3] "#lvan" "#nantes" "#ilsepassetoujoursuntruc"

What I expect: a data.frame with 1 hashtag per row

 "#savethedate"   "#histoire"  "#femmes"    "#ports"   NA   ....

What I tried:

hashtags_df <  

hashtags[!lengths(hashtags)] <- NA

This replaces the length-0 list elements with NAs. (Better solution via dirty sock sniffer.)
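
To see why the negated `lengths()` call picks out the empty elements, here is a minimal sketch on a toy list. The name `x` and its contents are made up for illustration. `lengths()` returns the length of each element, and `!` turns a 0 into TRUE and any positive length into FALSE, so only the empty elements are overwritten.

```r
# Toy list standing in for the str_extract_all() result (hypothetical data)
x <- list(c("#rock", "#nantes"), character(0), "#ports")

lengths(x)    # length of each list element
!lengths(x)   # TRUE only where the length is 0

# Overwrite the empty elements so they survive unlist() as NA
x[!lengths(x)] <- NA
unlist(x)
```

Without this step, `unlist()` would silently drop the empty element and the tweet without hashtags would leave no trace in the result.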

hashtags <- unlist(hashtags) 

This will give you a single vector of values. If you'd like a data frame, you can now use:

hashtags_df <- 

I don't know if this is the best way to extract hashtags, etc., but it should answer the question as asked.
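
Putting the steps together, the whole approach might look like the sketch below. It rests on a few assumptions that are not in the original post: base R's `regmatches()` and `gregexpr()` stand in for `str_extract_all()` so the sketch needs no package, the pattern `"#\\w+"` is swapped in so that the stray `]` after `#histoire` is not captured, and the final `data.frame()` call with a column named `hashtag` is a guess at the shape of the result.

```r
tests <- c("xxxxxx #savethedate xxxxxx #histoire] xxxxxx #femmes xxxxxxx #ports",
           "xxxxxxxxxxxx",
           "xxxx #rock xxxxxx #nantes",
           "xxxxxx #lvan xxxxxxx #nantes xxxxx #ilsepassetoujoursuntruc")

# Assumption: base R stand-in for stringr::str_extract_all().
# "#\\w+" stops at the first non-word character, so "]" is left out.
hashtags <- regmatches(tests, gregexpr("#\\w+", tests))

# Tweets without any hashtag become NA instead of disappearing
hashtags[!lengths(hashtags)] <- NA

# Assumption: one hashtag per row in a single character column
hashtags_df <- data.frame(hashtag = unlist(hashtags), stringsAsFactors = FALSE)
hashtags_df
```

If the link back to the originating tweet matters, something like `rep(seq_along(hashtags), lengths(hashtags))` could be added as a second column, since every list element has at least length 1 after the NA replacement.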

