R - For each row of a data frame, check if duplicate values exist


I have a data frame that contains the following values:

url                  response.code  count
www.site.com/page1   200            4
www.site.com/page1   301            1
www.site.com/page2   200            5
www.site.com/page3   301            4
www.site.com/page4   200            4
www.site.com/page4   403            1

For each unique value of url, I want to know whether multiple values of response.code exist. If only one url/response.code combination exists, the url is consistent. The desired output is a data frame like this:

url                  consistent
www.site.com/page1   FALSE
www.site.com/page2   TRUE
www.site.com/page3   TRUE
www.site.com/page4   FALSE

I could loop over each of the unique URLs and check the number of different values in response.code, but that doesn't seem like the R way to solve this.

Any suggestions on the best way to solve this? I'm new to R and checked multiple questions on duplicates here, but didn't find a solution for this particular issue.

You can use aggregate from base R. It counts the rows per url, so a url with exactly one row has a single response.code:

aggregate(response.code ~ url, df, length)[2] == 1
#      response.code
# [1,]         FALSE
# [2,]          TRUE
# [3,]          TRUE
# [4,]         FALSE

If you want the output in the required format, you can do:

agg <- aggregate(response.code ~ url, df, length)
new_df <- data.frame(url = agg$url, consistent = agg$response.code == 1)
new_df
#                  url consistent
# 1 www.site.com/page1      FALSE
# 2 www.site.com/page2       TRUE
# 3 www.site.com/page3       TRUE
# 4 www.site.com/page4      FALSE
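The aggregate call above counts rows per url, which works because each url/response.code pair appears only once in the example. If the same pair could appear on several rows, counting distinct codes may be safer. The following is a minimal sketch of that variant, not part of the original answer; it rebuilds df from the values in the question, and the names n_codes and result are made up for illustration.

```r
# Example data rebuilt from the question.
df <- data.frame(
  url = c("www.site.com/page1", "www.site.com/page1", "www.site.com/page2",
          "www.site.com/page3", "www.site.com/page4", "www.site.com/page4"),
  response.code = c(200, 301, 200, 301, 200, 403),
  count = c(4, 1, 5, 4, 4, 1),
  stringsAsFactors = FALSE
)

# Number of distinct response codes per url (n_codes is a hypothetical name).
n_codes <- tapply(df$response.code, df$url, function(x) length(unique(x)))

# A url is consistent when it has exactly one distinct response code.
result <- data.frame(
  url = names(n_codes),
  consistent = as.vector(n_codes) == 1,
  stringsAsFactors = FALSE
)
result
```

On the example data this marks page1 and page4 as inconsistent and page2 and page3 as consistent, the same as the aggregate version.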
