twitter - Splitting hashtags into a data.frame object with R


I am collecting Twitter hashtags. Each tweet can include several hashtags, or none at all.

tests <- c("xxxxxx #savethedate xxxxxx #histoire] xxxxxx #femmes xxxxxxx #ports",
           "xxxxxxxxxxxx",
           "xxxx #rock xxxxxx #nantes",
           "xxxxxx #lvan xxxxxxx #nantes xxxxx #ilsepassetoujoursuntruc")

library(stringr)

# \\S+ matches one or more non-whitespace characters after the "#"
hashtags <- str_extract_all(tests, "#\\S+")
str(hashtags)

My results:

> str(hashtags)
List of 4
 $ : chr [1:4] "#savethedate" "#histoire]" "#femmes" "#ports"
 $ : chr(0)
 $ : chr [1:2] "#rock" "#nantes"
 $ : chr [1:3] "#lvan" "#nantes" "#ilsepassetoujoursuntruc"

What I expect: a data.frame with one hashtag per row.

"#savethedate"   "#histoire"  "#femmes"    "#ports"   NA   ....

What I tried:

hashtags_df <- as.data.frame(hashtags)

hashtags[!lengths(hashtags)] <- NA

This replaces the length-0 list elements with NAs. (Better solution via dirty sock sniffer.)
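
To see why that indexing works: `lengths()` returns the number of elements in each list entry, and `!` turns the zeros into TRUE and every other count into FALSE. A minimal sketch on a small made-up list (the names `x` and `unlisted` and the values are hypothetical, chosen only for illustration):

```r
# Hypothetical list standing in for the str_extract_all() result
x <- list(c("#rock", "#nantes"), character(0), "#ports")

lengths(x)     # number of elements in each entry
!lengths(x)    # TRUE only where the entry is empty

# Replace the empty entries with NA so unlist() keeps a slot for them
x[!lengths(x)] <- NA
unlisted <- unlist(x)
```

Without the replacement step, `unlist()` would silently drop the empty entry and the tweet with no hashtags would leave no trace in the result.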

hashtags <- unlist(hashtags) 

This will give you a single vector of values. If you'd like a data frame, you can use as.data.frame now.

hashtags_df <- as.data.frame(hashtags) 
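
One thing `unlist()` discards is which tweet each hashtag came from. If that matters, a possible sketch follows, assuming a list with the same shape as the `str(hashtags)` output in the question, with the empty entry already set to NA. The column names `tweet` and `hashtag` are illustrative and not from the original post.

```r
# Same shape as the str_extract_all() result, after empties were set to NA
hashtags <- list(
  c("#savethedate", "#histoire]", "#femmes", "#ports"),
  NA_character_,
  c("#rock", "#nantes"),
  c("#lvan", "#nantes", "#ilsepassetoujoursuntruc")
)

hashtags_df <- data.frame(
  # repeat each tweet's index once per hashtag it contains
  tweet   = rep(seq_along(hashtags), lengths(hashtags)),
  hashtag = unlist(hashtags),
  stringsAsFactors = FALSE
)
```

The tweet without hashtags still gets one row, with NA in the `hashtag` column.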

I don't know whether this is the best way to extract hashtags, etc., but it should answer the question as asked.
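
On the extraction itself: a pattern that matches `#` followed by any run of non-whitespace characters (`#\\S+`) takes everything up to the next space, which is why `#histoire]` keeps its bracket in the results. One possible tightening is to allow only letters, digits and underscores after the `#`. The sketch below uses base R's `regmatches()` and `gregexpr()` as a stand-in for `str_extract_all()` so that it runs without extra packages. Whether this character class matches Twitter's own definition of a hashtag is an assumption, and the names `pattern` and `strict` are hypothetical.

```r
tests <- c("xxxxxx #savethedate xxxxxx #histoire] xxxxxx #femmes xxxxxxx #ports",
           "xxxxxxxxxxxx")

# "#" followed by one or more letters, digits or underscores
pattern <- "#[[:alnum:]_]+"

# gregexpr() finds every match per string; regmatches() extracts them as a list
strict <- regmatches(tests, gregexpr(pattern, tests))
```

Strings without a match come back as `character(0)`, so the NA replacement step from the answer still applies afterwards.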

