R - For each row of a data frame, check if duplicate values exist
I have a data frame that contains the following values:
url response.code count
www.site.com/page1 200 4
www.site.com/page1 301 1
www.site.com/page2 200 5
www.site.com/page3 301 4
www.site.com/page4 200 4
www.site.com/page4 403 1
For each unique value of url, I want to know whether multiple values of response.code exist. If only one url/response.code combination exists, the url is consistent. The desired output is a data frame like this:
url consistent
www.site.com/page1 FALSE
www.site.com/page2 TRUE
www.site.com/page3 TRUE
www.site.com/page4 FALSE
I could loop over each unique url and count the number of different values in response.code, but that doesn't seem like the R way to solve this.
Any suggestions on the best way to solve this? I'm new to R, and I checked multiple questions on duplicates here but couldn't find a solution to this particular issue.
You can use aggregate from base R:
aggregate(response.code ~ url, df, length)[2] == 1
#     response.code
#[1,]         FALSE
#[2,]          TRUE
#[3,]          TRUE
#[4,]         FALSE
If you want the output in the required format, you can do:
agg <- aggregate(response.code ~ url, df, length)
new_df <- data.frame(url = agg$url, consistent = agg$response.code == 1)
new_df
#                 url consistent
#1 www.site.com/page1      FALSE
#2 www.site.com/page2       TRUE
#3 www.site.com/page3       TRUE
#4 www.site.com/page4      FALSE
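Note that length counts rows per url, which equals the number of distinct response codes only when each url/response.code pair appears once, as it does in the sample data. If the same pair could appear on several rows, counting unique values is safer. The following is a minimal sketch under that assumption; the sample data is rebuilt from the question, and the names n_codes and consistent_df are illustrative only, not from the original post.

```r
# Sample data reconstructed from the question; the name `df` follows the answer.
df <- data.frame(
  url = c("www.site.com/page1", "www.site.com/page1", "www.site.com/page2",
          "www.site.com/page3", "www.site.com/page4", "www.site.com/page4"),
  response.code = c(200, 301, 200, 301, 200, 403),
  count = c(4, 1, 5, 4, 4, 1)
)

# Count distinct response codes per url instead of rows per url,
# so repeated url/response.code rows do not make a url look inconsistent.
n_codes <- aggregate(response.code ~ url, df, function(x) length(unique(x)))

# A url is consistent when exactly one distinct response code was seen.
consistent_df <- data.frame(url = n_codes$url,
                            consistent = n_codes$response.code == 1)
consistent_df
```

With the sample data this flags page1 and page4 as inconsistent, the same result as the length version, and it stays correct if a row such as page2/200 is repeated.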