
Thursday, May 9, 2019

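How to access reply tweets with tweepy

First, review the data that can be retrieved with user_timeline.

Tweets posted by a given user (extractor is the authenticated tweepy API object):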
tweets = extractor.user_timeline(screen_name="realDonaldTrump", count=30)



◇ Getting reply data

How to get the replies for a given tweet with tweepy and python?

https://stackoverflow.com/questions/52307443/how-to-get-the-replies-for-a-given-tweet-with-tweepy-and-python

(Sep 13 '18 at 5:55)

This code will fetch 10 recent tweets of an user(name) along with the replies to that particular tweet.


The code below retrieves them.

<after connecting with tweepy>


import sys
import tweepy

replies = []

# Map every character outside the Basic Multilingual Plane (e.g. emoji) to U+FFFD
non_bmp_map = dict.fromkeys(range(0x10000, sys.maxunicode + 1), 0xfffd)

# For each of the 10 most recent tweets, search the tweets addressed to the account
for full_tweets in tweepy.Cursor(api.user_timeline, screen_name='tartecosmetics', timeout=999999).items(10):
  for tweet in tweepy.Cursor(api.search, q='to:' + 'tartecosmetics', result_type='recent', timeout=999999).items(1000):
    if hasattr(tweet, 'in_reply_to_status_id_str'):
      # Keep only the tweets that reply to this particular tweet
      if tweet.in_reply_to_status_id_str == full_tweets.id_str:
        replies.append(tweet.text)
  print("Tweet :", full_tweets.text.translate(non_bmp_map))
  for elements in replies:
    print("Replies :", elements)
  replies.clear()
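
The matching step can be tried offline, without API access. The following is a minimal sketch, not the Stack Overflow author's code: it uses SimpleNamespace stand-ins instead of real tweepy Status objects, and the function name group_replies and all sample IDs are made up for illustration. It assumes you fetch the "to:" search results once and group them by parent ID, instead of re-running the search for every timeline tweet.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_replies(timeline, candidates):
    """Map each timeline tweet's id_str to the texts of candidates replying to it."""
    wanted = {tweet.id_str for tweet in timeline}
    grouped = defaultdict(list)
    for tweet in candidates:
        # A missing attribute and a None value are both treated as "not a reply".
        parent = getattr(tweet, "in_reply_to_status_id_str", None)
        if parent in wanted:
            grouped[parent].append(tweet.text)
    return dict(grouped)


# Stand-ins for tweepy Status objects (hypothetical sample data).
timeline = [
    SimpleNamespace(id_str="100", text="New product!"),
    SimpleNamespace(id_str="200", text="Sale today"),
]
candidates = [
    SimpleNamespace(in_reply_to_status_id_str="100", text="Love it"),
    SimpleNamespace(in_reply_to_status_id_str=None, text="@brand hello"),
    SimpleNamespace(in_reply_to_status_id_str="100", text="When?"),
    SimpleNamespace(in_reply_to_status_id_str="999", text="Unrelated"),
]

result = group_replies(timeline, candidates)
print(result)
```

With real data, timeline and candidates would be the items returned by the two Cursor calls; a single pass over the search results then replaces the nested loop.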



Explanation)

dict.fromkeys():creates a new dictionary from the given sequence of elements with a value provided by the user.

dictionary.fromkeys(sequence[, value])

Example)
# vowels keys
keys = {'a', 'e', 'i', 'o', 'u' }
value = [1]
vowels = dict.fromkeys(keys, value)
print(vowels)

{'a': [1], 'u': [1], 'o': [1], 'e': [1], 'i': [1]}

https://www.programiz.com/python-programming/methods/dictionary/fromkeys
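
One detail of dict.fromkeys() is worth a quick check when the value is mutable, as with the list [1] in the example: every key receives the same object, not a copy. The snippet below is a small illustration with made-up variable names; for non_bmp_map this does not matter, because the value there is an integer.

```python
keys = ['a', 'e', 'i']

# fromkeys() stores the very same list object under every key.
shared = dict.fromkeys(keys, [1])
shared['a'].append(2)
print(shared)

# A dict comprehension builds a separate list for each key.
independent = {key: [1] for key in keys}
independent['a'].append(2)
print(independent)
```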

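0x10000: In Python, as in C, numbers that begin with 0x are hexadecimal (base 16) literals.
So, 0x10000 is 16*16*16*16 = 16^4 = 65536, the first code point above the Basic Multilingual Plane (BMP).
sys.maxunicode: an integer giving the largest Unicode code point, that is, 1114111 (0x10FFFF in hexadecimal).

0xfffd: the code point of the Unicode replacement character (U+FFFD). In non_bmp_map, every non-BMP character such as an emoji is mapped to it, so translate() swaps those characters for U+FFFD before printing.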
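
To see what the mapping does in practice, here is a short sketch that applies it to a string containing an emoji. The sample text is invented; the mapping itself is the one from the code above.

```python
import sys

# Same mapping as in the reply-fetching code: non-BMP code point -> U+FFFD.
non_bmp_map = dict.fromkeys(range(0x10000, sys.maxunicode + 1), 0xfffd)

text = "Great launch \U0001F680 today"  # contains a rocket emoji (U+1F680)
cleaned = text.translate(non_bmp_map)

print(len(non_bmp_map))            # number of code points above the BMP
print(cleaned == "Great launch \ufffd today")
```

This kind of substitution is typically useful when the output device cannot render characters outside the BMP.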



** Below is a typical example. It uses the logging format, so it takes a little more work.

Code:

import logging
import time

import tweepy

user_name = "@nameofuser"

# tweet_id must hold the ID of the tweet whose replies you want
replies = tweepy.Cursor(api.search, q='to:{}'.format(user_name),
                        since_id=tweet_id, tweet_mode='extended').items()
while True:
    try:
        reply = replies.next()
        if not hasattr(reply, 'in_reply_to_status_id_str'):
            continue
        if reply.in_reply_to_status_id == tweet_id:
            logging.info("reply of tweet:{}".format(reply.full_text))

    except tweepy.RateLimitError as e:
        # Wait a minute, then try again
        logging.error("Twitter api rate limit reached: {}".format(e))
        time.sleep(60)
        continue

    except tweepy.TweepError as e:
        logging.error("Tweepy error occurred:{}".format(e))
        break

    except StopIteration:
        # No more search results
        break

    except Exception as e:
        logging.error("Failed while fetching replies {}".format(e))
        break





◇ Getting retweets





◇Tweet Object

https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/tweet-object.html

hasattr(): checks whether the given object has a particular attribute.

It returns True if the object has the attribute and False if it does not.
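
hasattr() only tells you that the attribute exists, not that it holds a value. Because in_reply_to_status_id_str is nullable, an attribute can be present and still be None. The sketch below illustrates this with SimpleNamespace stand-ins (not real tweepy objects), on the assumption that a null field is exposed as an attribute set to None.

```python
from types import SimpleNamespace

# Hypothetical stand-ins for tweet objects.
reply = SimpleNamespace(text="@brand thanks", in_reply_to_status_id_str="100")
plain = SimpleNamespace(text="hello", in_reply_to_status_id_str=None)
bare = SimpleNamespace(text="no reply field at all")

objects = (reply, plain, bare)
checks = [hasattr(obj, "in_reply_to_status_id_str") for obj in objects]
# getattr() with a default covers both "missing" and "present but None".
values = [getattr(obj, "in_reply_to_status_id_str", None) for obj in objects]

print(checks)
print(values)
```

This is why the reply-fetching code follows the hasattr() check with a comparison against the target tweet ID.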

since_id: returns only results with an ID greater than (that is, more recent than) the specified ID.

in_reply_to_status_id_str: Nullable. If the represented Tweet is a reply, this field will contain the string representation of the original Tweet’s ID. Example:

logging.info / logging.error: record log messages.

https://qiita.com/__init__/items/91e5841ed53d55a7895e

Requires import logging.

Calling the logging module directly: logging.info('info %s %s', 'test', 'test')
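
As a small illustration of the %s style, the sketch below sends two messages to an in-memory stream so the result can be inspected. The logger name "reply_fetcher", the format string, and the messages are all made up for this example.

```python
import io
import logging

stream = io.StringIO()

logger = logging.getLogger("reply_fetcher")  # hypothetical logger name
logger.setLevel(logging.INFO)                # INFO and above are recorded

handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s:%(message)s"))
logger.addHandler(handler)

# Arguments are merged into the message only if the record is actually emitted.
logger.info("reply of tweet:%s", "hello")
logger.error("Tweepy error occurred:%s", "rate limit")

output = stream.getvalue()
print(output)
```

Passing the values as arguments, rather than calling .format() first, leaves the string formatting to the logging module.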



◇Textblob

1) https://www.codementor.io/ferrorodolfo/sentiment-analysis-on-trump-s-tweets-using-python-pltbvb4xr

def clean_tweet(tweet):
    ''' Utility function to clean the text in a tweet by removing links and special characters using regex.'''
    return ' '.join(re.sub(r"(@[A-Za-z0-9]+)|([^0-9A-Za-z \t])|(\w+:\/\/\S+)", " ", tweet).split())

Percentage of positive tweets: 51.0%
Percentage of neutral tweets: 27.0%
Percentage de negative tweets: 22.0%
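
Percentages like these can be derived from per-tweet polarity scores. The sketch below assumes a simple sign-based rule (above zero is positive, below zero is negative, zero is neutral) and uses hard-coded, hypothetical polarity values so that it runs without TextBlob; in practice each value would come from TextBlob(cleaned_text).sentiment.polarity.

```python
from collections import Counter


def label(polarity):
    """Classify a polarity score by its sign (assumed rule)."""
    if polarity > 0:
        return "positive"
    if polarity < 0:
        return "negative"
    return "neutral"


# Hypothetical polarity scores, one per tweet.
polarities = [0.5, 0.0, -0.2, 0.8]

counts = Counter(label(p) for p in polarities)
percentages = {
    name: 100 * counts[name] / len(polarities)
    for name in ("positive", "neutral", "negative")
}

for name, value in percentages.items():
    print("Percentage of {} tweets: {}%".format(name, value))
```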





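(Explanation)

' '.join(): joins the elements of a list into one string, using the string before .join as the separator. ' ' puts a space between the elements; '' joins them with no separator.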

test = ['ab', 'c', 'de']

result = ''.join(test)

print(result):abcde

result2 = ','.join(test)

print(result2): ab,c,de



.split(): splits the string at the separator given in the parentheses. With no argument, it splits on runs of whitespace.

a = "abbaccadda"

sep = a.split("a")

print(sep) : ['', 'bb', 'cc', 'dd', '']

sep2 = a.split("ac")

print(sep2)

['abb', 'cadda']

* The separator string itself does not appear in the returned list.
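
In the tweet-cleaning function, split() is called with no argument and its result is passed to ' '.join(). The sketch below, with an invented sample string, shows why that pairing tidies up whitespace: split() with no argument drops leading and trailing whitespace and treats any run of spaces or tabs as one separator.

```python
raw = "  good \t morning   everyone "

words = raw.split()        # no argument: split on runs of whitespace
tidy = ' '.join(words)     # rebuild with exactly one space between words

by_space = raw.split(" ")  # explicit separator: empty strings are kept

print(words)
print(tidy)
print(by_space)
```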



re.sub(): re.sub(regular expression, "replacement string", string to process)

[★]: matches any single one of the characters ★
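
The pattern used in the cleaning function has three alternatives joined by |. As a rough sketch, each one can be tried separately on an invented sample string to see what it matches before combining them.

```python
import re

sample = "@alice check https://example.com now!"

mentions = re.findall(r"@[A-Za-z0-9]+", sample)     # @ followed by letters or digits
links = re.findall(r"\w+://\S+", sample)            # scheme://... up to the next whitespace
specials = re.findall(r"[^0-9A-Za-z \t]", sample)   # anything except letters, digits, space, tab

# Combined pattern: alternatives are tried left to right at each position.
pattern = r"(@[A-Za-z0-9]+)|([^0-9A-Za-z \t])|(\w+://\S+)"
cleaned = ' '.join(re.sub(pattern, " ", sample).split())

print(mentions)
print(links)
print(specials)
print(cleaned)
```

In the combined pattern the URL is removed as a whole, because at the "h" of "https" the first two alternatives fail and the third one consumes everything up to the next whitespace.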






import re

message="My office email address is test2018@test.com.But my personal one is test.craig@g.co.jp.Please visit our web site @ https://test.test.com. Company information is here at https://test.test.com/about Sincerely, 15 Jan, 2019 "

# Replace mentions, special characters, and URLs with spaces, then collapse the whitespace
new_message=' '.join(re.sub(r"(@[A-Za-z0-9]+)|([^0-9A-Za-z \t])|(\w+:\/\/\S+)", " ", message).split())

print(new_message)





Output:

My office email address is test2018 com But my personal one is test craig co jp Please visit our web site Company information is here at Sincerely 15 Jan 2019





Input message, for comparison:

My office email address is test2018@test.com. But my personal one is test.craig@g.co.jp.
Please visit our web site @ https://test.test.com.
Company information is here at https://test.test.com/about
Sincerely,
15 Jan, 2019










2) https://www.geeksforgeeks.org/twitter-sentiment-analysis-using-python/

'''Utility function to clean tweet text by removing links, special characters using simple regex statements. '''

    return ' '.join(re.sub(r"(@[A-Za-z0-9]+)|([^0-9A-Za-z \t])|(\w+:\/\/\S+)", " ", tweet).split())


