Banks Exploration and Analysis

  • Muneers Al Rashiad
  • Nethar Al Otaibi
  • Amna Burshid

Introduction

This project explores the followers and Twitter accounts of three banks, Boubyan Bank, Kuwait Financial House (KFH) and National Bank of Kuwait (NBK), and then analyzes each bank's activity.

Method Used

We chose Twitter as the data source and used Tweepy to gather data on the three banks in our project: Boubyan Bank, KFH and NBK. As an additional source, we gathered each bank's net profit from the Bayanati.com website.
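A minimal sketch of how collected user objects could be turned into a table. It assumes each object exposes its raw payload as a `_json` dict, as Tweepy model objects do. The helper `users_to_frame` and the `FakeUser` stand-in are hypothetical names used only for illustration, and no network call is made.

```python
import pandas as pd

def users_to_frame(users, bank_name):
    """Flatten an iterable of user objects into one DataFrame row per user.

    Assumes each object carries its raw payload in a `_json` dict.
    `bank_name` tags every row with the bank it was collected for.
    """
    rows = [dict(u._json) for u in users]
    frame = pd.DataFrame(rows)
    frame["Bank_Name"] = bank_name
    return frame

class FakeUser:
    """Hypothetical stand-in for a Tweepy user object (illustration only)."""
    def __init__(self, **fields):
        self._json = fields

sample = [FakeUser(id=1, screen_name="a"), FakeUser(id=2, screen_name="b")]
followers = users_to_frame(sample, "Boubyan")
```

With Tweepy, the `users` iterable might come from a cursor over the followers endpoint (`API.get_followers` in Tweepy 4, `API.followers` in older releases). That wiring is left out here because it needs credentials.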

We gathered three datasets for each bank:

  • Followers

      Each team member was assigned a specific bank and gathered 50,000 rows and 28 columns
      We combined the three dataframes and added a Bank_Name column to identify each bank
      We expected 150,000 rows in total but found 150,028
      After removing the 28 extra records, the number of rows decreased accordingly
  • User Timeline

      Each team member gathered 3,000+ rows and 15 columns
      Tweepy limits the number of tweets that can be collected
      The resulting dataframe was clean
  • Get User

      Each bank has one row and 4 columns
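The combining step for the follower tables can be sketched as below. The three small frames are made-up stand-ins for the per-bank data, and the dictionary name `frames` is an assumption; only the `Bank_Name` column name comes from the project.

```python
import pandas as pd

# Toy stand-ins for the three per-bank follower tables (values are made up).
boubyan = pd.DataFrame({"id": [1, 2], "screen_name": ["a", "b"]})
kfh = pd.DataFrame({"id": [3, 4], "screen_name": ["c", "d"]})
nbk = pd.DataFrame({"id": [5], "screen_name": ["e"]})

frames = {"Boubyan": boubyan, "KFH": kfh, "NBK": nbk}

# Tag each table with its bank, then stack them into one dataframe.
combined = pd.concat(
    [df.assign(Bank_Name=name) for name, df in frames.items()],
    ignore_index=True,
)

# Row count per bank, useful for spotting unexpected extra rows.
rows_per_bank = combined["Bank_Name"].value_counts()
```

Comparing `len(combined)` with the sum of the expected per-bank sizes is a quick way to catch surplus rows like the 28 mentioned above.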

Null Issue:

Notice that many columns have exactly 28 null values:

    contributors_enabled                 0
    created_at                          11
    description                     103282
    favourites_count                    28
    followers_count                     28
    friends_count                       28
    geo_enabled                         28
    id                                  28
    id_str                              28
    lang                                40
    listed_count                        28
    location                        105031
    name                                45
    profile_background_color            28
    profile_background_image_url    119970
    profile_background_tile             28
    profile_image_url                   28
    profile_text_color                  28
    profile_use_background_image        28
    protected                           28
    screen_name                         44
    statuses_count                      43
    time_zone                       139239
    url                             142275
    utc_offset                      139240
    verified                            45
    Bank_Name                            0
    dtype: int64
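A report like the one above can be produced with `isnull().sum()`. The sketch below uses a made-up frame whose last row mimics a malformed record. Dropping rows that lack an `id` is our assumption about how the 28 records could be removed, not necessarily the exact step used in the project.

```python
import pandas as pd

# Toy frame: the last row mimics a malformed record with no user fields.
df = pd.DataFrame({
    "id": [1.0, 2.0, None],
    "followers_count": [10.0, 5.0, None],
    "location": ["Kuwait", None, None],
    "Bank_Name": ["KFH", "NBK", "NBK"],
})

null_counts = df.isnull().sum()      # missing values per column
cleaned = df.dropna(subset=["id"])   # drop rows that have no user id
```

Columns such as `location` still contain nulls after this step, which is expected because many users simply leave them empty.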

Libraries

- Pandas
- Altair
- Matplotlib
- Numpy

Results

  • Followers
Variable | Variable definition | Data type | Missing data report | Report on the distribution of the data | Level of analysis
contributors_enabled | Indicates that the user has an account with “contributor mode” enabled, allowing for Tweets issued by the user to be co-authored by another account. Rarely true | Boolean | 0 | The column consists of 8310 unique values | User Tweets
created_at | The UTC datetime that the user account was created on Twitter | Categorical (object) | 0 | The column consists of 7116 unique values | Entities
description | The user-defined UTF-8 string describing their account | Continuous (float64) | 103271 | The minimum is 0.0 and the maximum is 1617.0; plotting it shows Boubyan Bank got the highest number of favourites on their tweets | User Tweets
favourites_count | The number of Tweets this user has liked in the account’s lifetime | Continuous (float64) | 0 | The column consists of 1 unique value; Boubyan Bank got the most likes on their tweets, as shown in the pie chart | User Tweets
followers_count | The number of followers this account currently has | Continuous (float64) | 0 | The column consists of NaN | GEO
friends_count | The number of users this account is following (AKA their “followings”) | Continuous (float64) | 0 | The column represents the number of tweets | User Tweet
geo_enabled | When true, indicates that the user has enabled the possibility of geotagging their Tweets. This field must be true for the current user to attach geographic data when using POST statuses / update | Boolean | 0 | The column consists of 5038 unique values; the pie chart shows Boubyan Bank has the most replies among the banks | Reply to screen name
id | The integer representation of the unique identifier for this User | Int64 | 0 | The column consists of 3 unique values; the bar chart shows Arabic is the top language among the 3 banks | User Tweets
id_str | The string representation of the unique identifier for this User | object | 0 | The minimum is 0.0 and the maximum is 692.0 retweets | Users
lang | The BCP 47 code for the user’s self-declared user interface language. May or may not have anything to do with the content of their Tweets | Categorical (object) | 0 | The column consists of 1 unique value; Boubyan Bank exceeded the other banks | Users
listed_count | The number of public lists that this user is a member of | Continuous (float64) | 0 | The column consists of 8 unique values; Boubyan Bank and NBK mostly use Lithium Tech. and KFH mostly uses Hootsuite | Users
location | The user-defined location for this account’s profile. Not necessarily a location, nor machine-parseable. This field will occasionally be fuzzily interpreted by the Search service | Categorical (object) | 105003 | The column consists of 9501 unique values | User Tweets
name | The name of the user, as they’ve defined it. Not necessarily a person’s name. Typically capped at 20 characters, but subject to change | Categorical (object) | 7 | The column consists of 74 unique values; the bar chart shows KFH is the most active bank | User Tweets
profile_background_color | The hexadecimal color chosen by the user for their background | Categorical (object) | 0 | 1862 unique values; a bar chart of the url columns shows that KFH has the highest number of URLs | User Tweets
profile_background_image_url | An HTTP-based URL pointing to the background image the user has uploaded for their profile | Object | 119942 | 1226 unique values | User Tweets
profile_background_tile | When true, indicates that the user’s profile_background_image_url should be tiled when displayed | Boolean | 0 | 4 unique values | Users Tweets
profile_image_url | An HTTP-based URL pointing to the user’s profile image | object | 0 | 68805 unique values | Users Tweets
profile_text_color | The hexadecimal color the user has chosen to display text with in their Twitter UI | object | 0 | 289 unique values; the most common color code is 333333, a very dark gray (near black) | User Tweets
profile_use_background_image | When true, indicates the user wants their uploaded background image to be used | Boolean | 0 | 16 unique values | Users Tweets
protected | When true, indicates that this user has chosen to protect their Tweets | Boolean | 0 | 15 unique values; Boubyan Bank has the highest share of protected accounts at 14.4% and NBK the lowest at 6.4% | User Tweets
screen_name | The screen name, handle, or alias that this user identifies themselves with. screen_names are unique but subject to change | object | 0 | 134665 unique values | Users Tweets
statuses_count | The number of Tweets (including retweets) issued by the user | object | 0 | 16901 unique values | Users Tweets
time_zone | A string describing the Time Zone this user declares themselves within | Categorical (object) | 139195 | 107 unique values; most followers are in Kuwait, and Boubyan Bank is the highest among the banks | Location
url | A URL provided by the user in association with their profile | object | 142247 | 6839 unique values | Users Tweets
utc_offset | The offset from GMT/UTC in seconds | integer | 139195 | Minimum -39600.0000 and maximum 50400.0000 | Location
verified | When true, indicates that the user has a verified account | Boolean | 0 | 2 unique values | Users Tweets
Bank_Name | The bank's name | object | 0 | 3 unique values representing the 3 banks: Boubyan, KFH and NBK | Users Tweets
  • User Timeline
Variable | Variable definition | Data type | Missing data report | Report on the distribution of the data | Level of analysis
Created_at | Time and date of the tweet by the bank | Categorical (object) | 0 | The column consists of 8310 unique values | User Tweets
Entities | Contains hashtags, URLs, media, user mentions and symbols from tweets | Categorical (object) | 0 | The column consists of 7116 unique values | Entities
Favorite_count | Number of likes on users' tweets | Continuous (float64) | 0 | The minimum is 0.0 and the maximum is 1617.0; plotting it shows Boubyan Bank got the highest number of favourites on their tweets | User Tweets
Favorited | Whether the tweet is liked or not | Boolean | 0 | The column consists of 1 unique value; Boubyan Bank got the most likes on their tweets, as shown in the pie chart | User Tweets
GEO | Similar to coordinates, it represents the geographic location of this Tweet | Continuous (float64) | 9676 | The column consists only of NaN | GEO
ID | A unique integer identifier for each tweet | Continuous (float64) | 0 | The column represents the number of tweets | User Tweet
In_reply_to_screen_name | The screen name of the user that the tweet replies to | Categorical (object) | 1688 | The column consists of 5038 unique values; the pie chart shows Boubyan Bank has the most replies among the banks | Reply to screen name
Lang | The language of the tweet | Categorical (object) | 0 | The column consists of 3 unique values; the bar chart shows Arabic is the top language among the 3 banks | User Tweets
Retweet_count | The number of times the tweet was retweeted | Continuous (float64) | 0 | The minimum is 0.0 and the maximum is 692.0 retweets | Users
Retweeted | Whether the tweet has been retweeted (True or False) | Boolean | 0 | The column consists of 1 unique value; Boubyan Bank exceeded the other banks | Users
Source | The client or device used to post the tweet | Categorical (object) | 0 | The column consists of 8 unique values; Boubyan Bank and NBK mostly use Lithium Tech. and KFH mostly uses Hootsuite | Users
Text | The text of the tweet, which can include hashtags, symbols, media and URLs | Categorical (object) | 0 | The column consists of 9501 unique values | User Tweets
Hashtags | Used to make tweets easy to search by adding a (#) symbol before a word or short phrase | Categorical (object) | 9336 | The column consists of 74 unique values; the bar chart shows KFH is the most active bank | User Tweets
URLs | Links to the banks' websites or to media websites | Categorical (object) | 7323 | 1862 unique values; a bar chart of the url columns shows that KFH has the highest number of URLs | User Tweets
Profit | The net profit of each bank | Continuous (float64) | 0 | The minimum profit is -1.180 (in million dollars) and the maximum is 1.191 (in million dollars) | Banks and Year
  • Get User
Bank Name | Created_at | Name | Location
KFH | 09/09/2009 | Kuwait Financial House | Kuwait
NBK | 15/09/2009 | National Bank of Kuwait | Kuwait
Boubyan | 02/02/2010 | Boubyan Bank | Kuwait

Followers Insights:

  • We wanted to see when each bank's followers created their accounts (figures: KFH Followers, Boubyan Followers, NBK Followers).

Question:

  • Why 2017? Tweepy downloads each bank's most recent followers first, so it makes sense that recently created accounts dominate.
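Counting follower accounts by creation year can be sketched as follows. The rows are made up, and the `created_at` string format is an assumption based on the raw format Twitter returns; adjust the `format` argument if the column was stored differently.

```python
import pandas as pd

df = pd.DataFrame({
    "created_at": [
        "Mon Mar 06 10:15:00 +0000 2017",
        "Tue Aug 15 08:00:00 +0000 2017",
        "Sat Jan 02 12:30:00 +0000 2016",
    ],
    "Bank_Name": ["KFH", "KFH", "NBK"],
})

# Parse the timestamps, then count accounts per bank and creation year.
created = pd.to_datetime(df["created_at"], format="%a %b %d %H:%M:%S %z %Y")
df["created_year"] = created.dt.year
accounts_per_year = df.groupby(["Bank_Name", "created_year"]).size()
```

A peak in the most recent year for every bank would be consistent with the explanation that the newest followers are downloaded first.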

Notice:

  • We looked at each column in the dataframe to see whether anything unusual appears.

Boubyan Bank protected accounts:

NBK protected accounts:

KFH protected accounts:

  • Comparing protected accounts across the banks, the first graph shows that Boubyan Bank has a larger share of followers with protected accounts than the other banks.
  • In terms of language, most followers of every bank use Arabic as their interface language; English comes second and en-gb (British English) third.

  • We looked at the rare languages and found that most of them occur among KFH followers.

  • Looking at time_zone, which only some followers of each bank have set, most followers of all three banks are in Kuwait, followed by US & Canada and then Baghdad.
  • Boubyan Bank (shown in blue) has more followers who set a time zone in their accounts than the other banks.
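The protected-account and language comparisons above can be sketched with a couple of group-wise aggregations. The rows below are made up, and only the column names `Bank_Name`, `protected` and `lang` come from the followers table.

```python
import pandas as pd

df = pd.DataFrame({
    "Bank_Name": ["Boubyan", "Boubyan", "KFH", "KFH", "NBK", "NBK"],
    "protected": [True, False, False, False, False, True],
    "lang": ["ar", "ar", "ar", "en", "en-gb", "ar"],
})

# Percentage of protected follower accounts per bank.
protected_pct = df.groupby("Bank_Name")["protected"].mean() * 100

# Follower counts per interface language, most common first.
lang_counts = df["lang"].value_counts()
```

Using the mean of a Boolean column gives a share rather than a raw count, which keeps the comparison fair if the banks end up with different numbers of rows.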

Followers Description

  • Boubyan Bank has the most followers who wrote a profile description.

[Figure: followerlocation]

  • The number of followers who filled in a location is also highest for Boubyan.

[Figure: followername]

  • Boubyan has the most followers without a name, while NBK has the most followers who filled in their names.
  • Likewise, Boubyan has the highest counts among the banks for the utc_offset and url columns.

[Figure: followerBG1]

  • Looking at profile background colors, Boubyan followers have the most unique colors of the three banks.

[Figure: FollowersBG2]

  • We took the top colors to see what they were: F5F8FA is a light gray to white, which is the default background color, while C0DEED is light blue.
  • We wanted to see which bank's followers use the default color the most.

[Figure: bg_uniqu]

Conclusion for this graph:

  • Since F5F8FA is the default profile color, we notice that NBK followers have it the most, with KFH close behind and Boubyan the least. We can conclude the following:
  • Changing the profile background color requires a PC, not a mobile device, so this suggests that Boubyan followers use a PC the most to change their profile background; the graph above also shows that Boubyan has the most unique colors.
  • NBK followers are the least likely to have used a PC to change their profile background color, while Boubyan followers are the most likely.
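The default-color comparison can be sketched as below. The rows are made up; `DEFAULT_BG` and `is_default_bg` are names we introduce for illustration, and upper-casing the codes first is an assumption in case the stored values mix cases.

```python
import pandas as pd

DEFAULT_BG = "F5F8FA"  # default background color named in the text

df = pd.DataFrame({
    "Bank_Name": ["Boubyan", "Boubyan", "KFH", "NBK", "NBK"],
    "profile_background_color": ["C0DEED", "000000", "F5F8FA", "F5F8FA", "f5f8fa"],
})

# Normalize case so "f5f8fa" and "F5F8FA" count as the same color.
colors = df["profile_background_color"].str.upper()
df["is_default_bg"] = colors == DEFAULT_BG

# Share of followers on the default color, and distinct colors per bank.
default_share = df.groupby("Bank_Name")["is_default_bg"].mean()
unique_colors = colors.groupby(df["Bank_Name"]).nunique()
```

A low default share together with a high number of distinct colors is the pattern the conclusion above attributes to Boubyan followers.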

Followers that tweet the most:

  • This graph shows that Boubyan's followers tweet double the amount of the other two banks' followers combined, which shows that Boubyan followers are the most active tweeters, unlike NBK's followers, who tweet the least of all three.

Followers Twitter activity:

[Figure: follower2017]

[Figure: follower2017-2]

[Figure: follower2017-3]

Conclusion

  • From the three graphs above, most activity lies with Boubyan Bank followers: they have the highest favourites, followers and followings counts, which tells us they are the most active.

Question

  • Now that we have seen that Boubyan Bank followers lead in Twitter activity, enabling locations and changing colors, we want to know why. In the next dataframe, for the users, we analyze the banks to see what is unique about Boubyan's account and how it differs from the other accounts.

Users Insights:

Bank Name | Created_at | Name | Location
KFH | 09/09/2009 | Kuwait Financial House | Kuwait
NBK | 15/09/2009 | National Bank of Kuwait | Kuwait
Boubyan | 02/02/2010 | Boubyan Bank | Kuwait

Notices:

  • We wanted to see whether there is a relation between the banks' Twitter activity and their profits. Unfortunately we couldn't get the information we wanted because Twitter doesn't provide it, so instead we looked at whether profits changed over time after each account was created.
  • Boubyan Bank created their account in 2010, and their profits increased dramatically after that.
  • KFH created their account in 2009, and their profit did not change much.
  • NBK created their account in 2009, and their profits increased over time, as with Boubyan.
  • Number of NBK Tweets in a Month
  • Number of Boubyan Bank Tweets in a Month
  • Number of KFH Tweets in a Month

Tweets Activity

  • Each bank's tweet activity increases toward the end of the year; the highest is Boubyan, with more than 1K tweets in November.
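Monthly tweet counts per bank can be sketched as below. The dates are made up, and the sketch assumes the timeline's creation timestamps have already been parsed into datetimes.

```python
import pandas as pd

tweets = pd.DataFrame({
    "Bank_Name": ["Boubyan", "Boubyan", "Boubyan", "KFH", "NBK"],
    "created_at": pd.to_datetime([
        "2017-11-03", "2017-11-20", "2017-10-01", "2017-11-05", "2017-09-12",
    ]),
})

# One row per bank, one column per month, zero where a bank did not tweet.
tweets["month"] = tweets["created_at"].dt.month
per_month = tweets.groupby(["Bank_Name", "month"]).size().unstack(fill_value=0)
```

The resulting table has banks as rows and month numbers as columns, so it can be passed straight to a bar or line chart.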

Comparing to see which bank tweets the most during this year:

  • Boubyan tweets slightly more than the others, by a fraction that isn't significant.

Among those tweets, which get favorited the most by followers?

  • Boubyan also takes the lead here.

Banks URLs

  • From this graph, KFH posts the most URLs, followed by Boubyan, with NBK last.

Banks that Reply to their followers the most:

  • Again, Boubyan and KFH are very close, while NBK falls behind and doesn't interact with its followers as much as the other banks.

Banks that use hashtags the most:

Notice:

  • A significant observation here is that Boubyan tweets more hashtags than the other two banks combined, so we wanted to see which hashtags they tweet the most.

Comparing the banks' hashtags:

Boubyan hashtags

  • They use BoubyanService and بوبيان كل ثلاثاء (Boubyan every Tuesday) frequently.

KFH hashtags

  • KFH tweets a wider variety of hashtags than Boubyan, but not as frequently.

NBK hashtags

  • NBK tweets the fewest hashtags and uses them the least frequently.
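The hashtag comparison can be sketched by extracting hashtags from the tweet text and counting them per bank. The tweets below are made up, and pulling tags out of `text` with a regular expression is our assumption; the project may have read them from the entities field instead.

```python
import pandas as pd

tweets = pd.DataFrame({
    "Bank_Name": ["Boubyan", "Boubyan", "KFH", "NBK"],
    "text": [
        "Ask us anything #BoubyanService",
        "We reply daily #BoubyanService #Kuwait",
        "New branch #KFH",
        "No tags here",
    ],
})

# One row per (tweet, hashtag); tweets without hashtags are dropped.
tags = (
    tweets.assign(hashtag=tweets["text"].str.findall(r"#(\w+)"))
    .explode("hashtag")
    .dropna(subset=["hashtag"])
)

tag_counts = tags.groupby(["Bank_Name", "hashtag"]).size()   # frequency per tag
tags_per_bank = tags.groupby("Bank_Name").size()             # total tags per bank
```

`tag_counts` answers which hashtags each bank repeats most, while `tags_per_bank` gives the overall volume used in the bank-to-bank comparison.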

Comparing retweets of hashtags made by bank accounts:

  • Boubyan gets the most retweets for their hashtags, which we attribute to their active followers.

Let's see which hashtags get the most retweets:

Boubyan Bank

KFH

NBK

Notice:

  • The most retweeted Boubyan tweets are for hashtags about prizes, draws and lottery winners. They market their tweets by asking for retweets to pick winners, which creates interaction between them and their followers.

Notice:

  • Most of the banks' tweets are in Arabic, but NBK tweets in English more than the other banks.

Looking into which source these accounts tweet from:

Boubyan Bank

KFH

NBK

Notice:

  • Boubyan and NBK both tweet mostly through Lithium Tech., while KFH uses Hootsuite.
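Finding each bank's most used posting source can be sketched as below. The rows and source labels are made up for illustration and are not taken from the collected timelines.

```python
import pandas as pd

tweets = pd.DataFrame({
    "Bank_Name": ["Boubyan", "Boubyan", "Boubyan", "KFH", "KFH", "KFH", "NBK"],
    "source": [
        "Lithium Tech.", "Lithium Tech.", "Twitter Web Client",
        "Hootsuite", "Hootsuite", "Twitter for iPhone",
        "Lithium Tech.",
    ],
})

# For each bank, pick the source label that appears most often.
top_source = tweets.groupby("Bank_Name")["source"].agg(
    lambda s: s.value_counts().idxmax()
)
```

If two sources tie for a bank, `idxmax` returns only one of them, so a full `value_counts()` per bank is the safer view when counts are close.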

Definitions:

  • Hootsuite is a social media management platform. The system’s user interface takes the form of a dashboard and supports social network integrations for Twitter, Facebook, etc.

  • Lithium Technologies is a San Francisco-based provider of software that allows businesses to connect with their customers on social media and digital channels.

Conclusion:

  • From analysing each bank's activity across the graphs above, we can see that Boubyan Bank tweets the most, uses hashtags most frequently and gets the most retweets.
  • Boubyan Bank also replies the most to its followers, so its relationship with them is stronger than the other banks'. This is likely why it gains active followers: the account itself is active, creating hashtags, offering prizes to its followers and so on.