Using shadowsocks-libev and polipo on a Synology NAS

  I had long relied mainly on SSH dynamic port forwarding to get through the GFW. The security of data carried over an SSH tunnel is not a concern, but SSH reportedly has a fairly distinctive traffic signature that makes the server easy to expose, and SSH proxies are not particularly friendly to use on Windows. So after buying my own VPS I decided to use shadowsocks as the proxy server.

  The Python version of shadowsocks stopped being updated long ago, after its author clowwindy was "invited to tea" by the authorities. Thanks to open source, shadowsocks exists in many other languages, for example the pure-C shadowsocks-libev, whose low resource consumption better suits a modest VPS and my NAS with its mere 128 MB of RAM.

  The VPS runs Ubuntu, so on the server side it is simple to either build from source or install from the repositories, and Windows and Mac OS X have ready-made clients. The only slightly tricky part is the Synology NAS: its hardware is painfully weak and it lacks a build environment, so everything has to be cross-compiled on another Linux machine. For cross-compilation, Synology provides a detailed developer guide worth consulting.

Part 1: Environment preparation

  1. Prepare a build server. Any common Linux distribution will do, but note that on a 64-bit system you additionally need libc6-i386, because the toolchain binaries are 32-bit. I used Ubuntu 16.04 amd64 in a Parallels Desktop virtual machine.

  2. Determine the NAS's CPU type and DSM version and download the matching toolchain. For my aging DS211j (CPU: Marvell mv6282, DSM: 6.0), the archive to download is 6281-gcc464_glibc215_88f6281-GPL.txz.

  3. Upload the toolchain to the build server.

  Note: the scripts below assume the working directory is /home/jinnlynn/nas, with the toolchain uploaded into that directory.

Part 2: Building shadowsocks-libev
# working directory; the 6281-gcc464_glibc215_88f6281-GPL.txz downloaded earlier lives here
WDIR=/home/jinnlynn/nas
# destination directory; the compiled files will end up here
DIST=$WDIR/dist

cd $WDIR

# install build tools
sudo apt-get -y install make binutils
# on a 64-bit system the 32-bit libc is needed
uname -p | grep -q 64 && sudo apt -y install libc6-i386

# unpack the toolchain; this creates the arm-marvell-linux-gnueabi folder
tar xvf 6281-gcc464_glibc215_88f6281-GPL.txz

# cross-compile environment variables; they differ per CPU, see the
# "Compile Open Source Projects" chapter of the Synology developer guide
export CC=$WDIR/arm-marvell-linux-gnueabi/bin/arm-marvell-linux-gnueabi-gcc
export LD=$WDIR/arm-marvell-linux-gnueabi/bin/arm-marvell-linux-gnueabi-ld
export RANLIB=$WDIR/arm-marvell-linux-gnueabi/bin/arm-marvell-linux-gnueabi-ranlib
export CFLAGS="-I$WDIR/arm-marvell-linux-gnueabi/arm-marvell-linux-gnueabi/libc/include"
export LDFLAGS="-L$WDIR/arm-marvell-linux-gnueabi/arm-marvell-linux-gnueabi/libc/lib"

# dependency: zlib, download and build
curl -O http://zlib.net/zlib-1.2.8.tar.gz
tar xvf zlib-1.2.8.tar.gz
cd zlib-1.2.8/
./configure --prefix=$DIST/zlib-1.2.8
make install
cd $WDIR

# dependency: openssl, download and build
curl -O https://www.openssl.org/source/openssl-1.0.2h.tar.gz
tar xvf openssl-1.0.2h.tar.gz
cd openssl-1.0.2h
./Configure dist --prefix=$DIST/openssl-1.0.2h
make
make install
cd $WDIR

# build shadowsocks-libev
curl -OL https://github.com/shadowsocks/shadowsocks-libev/archive/v2.4.6.tar.gz
tar xvf v2.4.6.tar.gz
cd shadowsocks-libev-2.4.6
# configure; note the --host option, whose value may differ for other NAS models,
# see the "Compile Open Source Projects" chapter of the Synology developer guide
./configure \
    --with-zlib=$DIST/zlib-1.2.8 \
    --with-openssl=$DIST/openssl-1.0.2h \
    --prefix=$DIST/shadowsocks-libev-2.4.6 \
    --host=armle-unknown-linux
make
make install
cd $WDIR

  Copy the generated shadowsocks-libev-2.4.6 directory to the NAS and it is ready to use.
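On the NAS the binaries read their settings from a JSON file. Below is a minimal sketch of such a config; the server address, password, and cipher are placeholders to replace with your own, not values from this article:

```json
{
    "server": "my-vps.example.com",
    "server_port": 8388,
    "local_address": "127.0.0.1",
    "local_port": 1080,
    "password": "your-password",
    "timeout": 300,
    "method": "aes-256-cfb"
}
```

Saved as, say, config.json, the client is started on the NAS with `ss-local -c config.json` (and the server side on the VPS with `ss-server -c config.json`).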

Part 3: Building polipo

  shadowsocks-libev is a SOCKS5 proxy, but some applications can only use an HTTP proxy, so a converter is needed. My pick is polipo, which is lightweight and efficient; the ipkg repository carries it too, but at an outdated version, so it is best to build it yourself as well.

# working directory
WDIR=/home/jinnlynn/nas
# destination directory; the compiled files will end up here
DIST=$WDIR/dist/

cd $WDIR

# install build tools
sudo apt-get -y install texinfo

# cross-compile environment variables; they differ per CPU, see the
# "Compile Open Source Projects" chapter of the Synology developer guide
export CC=$WDIR/arm-marvell-linux-gnueabi/bin/arm-marvell-linux-gnueabi-gcc
export LD=$WDIR/arm-marvell-linux-gnueabi/bin/arm-marvell-linux-gnueabi-ld
export RANLIB=$WDIR/arm-marvell-linux-gnueabi/bin/arm-marvell-linux-gnueabi-ranlib
export CFLAGS="-I$WDIR/arm-marvell-linux-gnueabi/arm-marvell-linux-gnueabi/libc/include"
export LDFLAGS="-L$WDIR/arm-marvell-linux-gnueabi/arm-marvell-linux-gnueabi/libc/lib"

curl -OL https://github.com/jech/polipo/archive/polipo-1.1.1.tar.gz
tar xvf polipo-1.1.1.tar.gz
cd polipo-polipo-1.1.1/

make all

# polipo's `make install` would install into the build machine's system directories,
# so copy out the needed files by hand instead
PREFIX=$DIST/polipo-1.1.1
mkdir -p $PREFIX/bin
mkdir -p $PREFIX/share/www/doc
rm -f $PREFIX/bin/polipo
cp -f polipo $PREFIX/bin/
cp -f html/* $PREFIX/share/www/doc
cp -f localindex.html $PREFIX/share/www/index.html
mkdir -p $PREFIX/man/man1
mkdir -p $PREFIX/info
cp -f polipo.man $PREFIX/man/man1/polipo.1
cp polipo.info $PREFIX/info/

  Likewise, copy the polipo-1.1.1 directory to the NAS. One thing to note: at runtime you may see "Disabling disk cache: No such file or directory" and "Disabling local tree: No such file or directory". The cache directory and local documentation directory baked in at build time do not necessarily exist on the NAS; just start polipo with the localDocumentRoot and diskCacheRoot options pointing at valid directories and the messages go away.
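Polipo accepts these options on the command line or in a config file. A sketch of a config that forwards to a local ss-local SOCKS5 listener; the addresses and ports are placeholders for whatever your setup uses, and the two empty-string settings simply disable the disk cache and local tree so the warnings above never appear:

```
# hypothetical polipo.conf, adjust addresses and ports to your setup
proxyAddress = "0.0.0.0"              # where the HTTP proxy listens
proxyPort = 8123
socksParentProxy = "127.0.0.1:1080"   # ss-local's SOCKS5 listener
socksProxyType = socks5
diskCacheRoot = ""                    # "" disables the disk cache
localDocumentRoot = ""                # "" disables the local document tree
```

Start it with `polipo -c polipo.conf`; the same settings can also be passed directly on the command line, e.g. `polipo diskCacheRoot=""`.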


Python script: skill-usage statistics for the heroes of the top 1000 Diablo 3 players

  Honestly I don't care much for games, with the sole exception of Blizzard's Diablo series. I started playing Diablo 3 early last year, on and off, and the most troublesome part is picking skills: every patch may bring a better build, which is bad news for a casual player like me. Fortunately Greater Rifts brought a ladder, and borrowing the builds of the top-ranked players can hardly go wrong, so I spent a little time writing this script.

Diablo3

  The script only tallies the usage of active skills, passive skills, and legendary gems. In principle it would be just as easy to tally other things such as equipment, but Diablo's item-generation mechanics make that rather pointless: items with the same name differ in their rolled affixes and are hard to compare, and given some items' dreadful drop rates you can't have them just because you want them.
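The tally the script maintains per skill is essentially a name-to-count map plus a sort by count; here is the same idea as a minimal standalone sketch (the skill slugs are made-up sample data, not real API output):

```python
from collections import Counter

# hypothetical active-skill slugs for three ranked heroes
heroes_active_skills = [
    ['multishot', 'vengeance', 'companion'],
    ['multishot', 'vengeance', 'strafe'],
    ['multishot', 'companion', 'strafe'],
]

# count every occurrence across all heroes
usage = Counter(slug for skills in heroes_active_skills for slug in skills)

# most_common() sorts by count, like the script's sort() helper
for slug, count in usage.most_common():
    print('{}[{}]'.format(slug, count))
```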

  As an aside, Python is remarkably well suited to relatively simple scripts like this one. In a word: fast.

# -*- coding: utf-8 -*-
"""
Diablo3: skill statistics for the heroes of the top 1000 ranked players

python diablo.py help
python diablo.py [barbarian|crusader|demon-hunter|monk|witch-doctor|wizard]

The Asia server's data is used by default; for the Americas or Europe,
change the `_rank_page` and `_api` variables

Copyright (c) 2015 JinnLynn <eatfishlin@gmail.com>
Released under the terms of the MIT license.
"""
from __future__ import unicode_literals, print_function, absolute_import
import os
import sys
import urllib2
import json
import re

__version__ = '1.0.0'
__author__ = 'JinnLynn <eatfishlin@gmail.com>'
__license__ = 'The MIT License'
__copyright__ = 'Copyright 2015 JinnLynn'

# rankings page
_rank_page = 'http://tw.battle.net/d3/zh/rankings/'
# api
_api = 'http://tw.battle.net/api/d3/'
_api_profile = os.path.join(_api, 'profile')
_api_data = os.path.join(_api, 'data')

_hero_classes = {
    'barbarian': '野蠻人', 'crusader': '聖教軍', 'demon-hunter': '狩魔獵人',
    'monk': '武僧', 'witch-doctor': '巫醫', 'wizard': '秘術師'}

_retry = 5

_hero_class = ''
_active_skills = {}
_passive_skills = {}
_unique_gems = {}


def _clear_output(msg=''):
    sys.stdout.write('\r{:30}'.format(' '))
    sys.stdout.write('\r{}'.format(msg))
    sys.stdout.flush()


def _process(stated, total):
    msg = 'Analyzing heroes... {}/{}'.format(stated, total)
    _clear_output(msg)


def _get(url, is_json=True):
    # print('GET: ', url)
    retry = 5 if _retry < 1 else _retry
    while retry > 0:
        try:
            req = urllib2.urlopen(url.encode('utf8'), timeout=10)
            return json.load(req) if is_json else req.read()
        except KeyboardInterrupt, e:
            raise e
        except Exception, e:
            retry -= 1
            # print('retry', retry, e)
            # raise e


def _api_url(*args, **kwargs):
    slash = kwargs.get('slash', False)
    args = [unicode(arg) for arg in args]
    url = os.path.join(*args).rstrip('/')
    return url + '/' if slash else url


def get_era():
    req = urllib2.urlopen(_rank_page)
    return req.geturl().split('/')[-2]


def get_rank_page_url(era):
    url_part = 'rift-'
    if _hero_class == 'demon-hunter':
        url_part += 'dh'
    elif _hero_class == 'witch-doctor':
        url_part += 'wd'
    else:
        url_part += _hero_class
    return os.path.join(_rank_page, 'era', era, url_part)


def fetch_rank_list():
    tags = []
    try:
        _clear_output('Fetching current era...')
        era = get_era()
        _clear_output('Fetching the top 1000 ranked players...')
        url = get_rank_page_url(era)
        html = _get(url, is_json=False)
        # re parse
        lst = re.findall(
            r"a href=\"(.*)\" title=.*class=\"icon-profile link-first\">",
            html.decode('utf8'),
            re.UNICODE)
        # BeautifulSoup parse
        # import bs4
        # soup = bs4.BeautifulSoup(html)
        # lst = soup.select('#ladders-table tbody tr .battletag a')['href']
        for item in lst:
            try:
                tags.append(item.split('/')[-2])
            except:
                pass
    except Exception, e:
        print('fetch rank list fail. {}'.format(_rank_page))
        raise e
    return tags


def get_hero(player_tag):
    url = _api_url(_api_profile, player_tag, slash=True)
    data = _get(url)
    hero_selected = None
    for hero in data.get('heroes', []):
        if hero['class'] != _hero_class:
            continue
        # keep the most recently played hero of this class
        if (hero_selected is None or
                hero_selected['last-updated'] < hero['last-updated']):
            hero_selected = hero
    if not hero_selected:
        raise Exception('{} hero missing.'.format(player_tag))
    url = _api_url(_api_profile, player_tag, 'hero', hero_selected['id'])
    return _get(url)


# active-skill runes
def stat_active_skill_rune(skill_slug, rune):
    global _active_skills
    if not rune:
        return
    slug = rune.get('slug')
    if slug in _active_skills[skill_slug]['rune']:
        _active_skills[skill_slug]['rune'][slug]['count'] += 1
    else:
        _active_skills[skill_slug]['rune'][slug] = {
            'count': 1,
            'name': rune.get('name')
        }


# active skills
def stat_active_skill(active):
    global _active_skills
    slug = active.get('skill', {}).get('slug')
    # the d3 API may return empty entries
    if not slug:
        return
    if slug in _active_skills:
        _active_skills[slug]['count'] += 1
    else:
        _active_skills[slug] = {
            'count': 1,
            'name': active.get('skill').get('name'),
            'rune': {}
        }
    stat_active_skill_rune(slug, active.get('rune'))


# passive skills
def stat_passive_skill(passive):
    global _passive_skills
    slug = passive.get('skill', {}).get('slug')
    # the d3 API may return empty entries
    if not slug:
        return
    if slug in _passive_skills:
        _passive_skills[slug]['count'] += 1
    else:
        _passive_skills[slug] = {
            'count': 1,
            'name': passive.get('skill').get('name')
        }


def stat_unique_gem(items):
    global _unique_gems

    def get_gem(tooltip):
        if not tooltip:
            return None, None
        url = _api_url(_api_data, tooltip)
        data = _get(url)
        gems = data.get('gems')
        if not gems:
            return None, None
        gem = gems[0].get('item', {})
        return gem.get('id'), gem.get('name')

    if not items:
        return

    lst = [items.get(s, {}) for s in ['leftFinger', 'rightFinger', 'neck']]
    for tooltip in [d.get('tooltipParams', None) for d in lst]:
        id_, name = get_gem(tooltip)
        if not id_:
            continue
        if id_ in _unique_gems:
            _unique_gems[id_]['count'] += 1
        else:
            _unique_gems[id_] = {
                'count': 1,
                'name': name
            }


def stat(hero):
    global _active_skills, _passive_skills

    map(stat_active_skill, hero.get('skills', {}).get('active', []))
    map(stat_passive_skill, hero.get('skills', {}).get('passive', []))

    items = hero.get('items', {})
    stat_unique_gem(items)


def output(hero_stated, hero_stat_failed):
    def sort(data, count=10):
        d = sorted(data.items(), key=lambda d: d[1]['count'], reverse=True)
        return d if count <= 0 else d[0:count]

    _clear_output()

    print('\n=== RESULT ===\n')
    print('Heroes tallied\n')
    print('  succeeded: {} failed: {}\n'.format(hero_stated, hero_stat_failed))

    print('Active skill usage ranking: ')
    for _, d in sort(_active_skills):
        runes = []
        for _, r in sort(d.get('rune', {})):
            runes.append('{name}[{count}]'.format(**r))
        d.update({'rune_rank': ', '.join(runes)})
        print('  {name}[{count}]: {rune_rank}'.format(**d))
    print()

    print('Passive skill usage ranking: ')
    for _, d in sort(_passive_skills):
        print('  {name}[{count}]'.format(**d))
    print()

    print('Legendary gem usage ranking: ')
    for _, d in sort(_unique_gems):
        print('  {name}[{count}]'.format(**d))
    print()


def prepare():
    global _hero_class

    def print_hc():
        print('Only the following hero classes are supported (default: demon-hunter):\n')
        for c, n in _hero_classes.items():
            print(c, ':', n)

    if len(sys.argv) == 1:
        _hero_class = 'demon-hunter'
    elif len(sys.argv) > 2:
        sys.exit('invalid arguments')
    else:
        arg = sys.argv[1]
        if arg == 'help':
            print_hc()
            print('\nTip: press Ctrl+C at any time to stop and print the results tallied so far')
            sys.exit()
        elif arg not in _hero_classes:
            print_hc()
            sys.exit()
        else:
            _hero_class = arg


def main():
    prepare()
    print('Hero class to analyze:', _hero_classes[_hero_class])

    hero_stated = 0
    hero_stat_failed = 0
    try:
        tags = fetch_rank_list()
        if not tags:
            raise Exception('parse battle.net rank page fail.')
    except Exception, e:
        print('error,', e)
        sys.exit()

    total = len(tags)

    for tag in tags:
        try:
            hero = get_hero(tag)
            if not hero:
                raise Exception('no hero data')
            stat(hero)
            hero_stated += 1
            _process(hero_stated, total)
        except KeyboardInterrupt:
            break
        except Exception, e:
            # print('Fail: ', tag, e, hero)
            hero_stat_failed += 1

    output(hero_stated, hero_stat_failed)


if __name__ == '__main__':
    main()

Python script: Taobao spending statistics

  The numbers on Alipay's ten-year statement are a little scary, but it covers too many categories; I just wanted to know how much I had spent purely on Taobao. So I wrote this script, which tallies Taobao order spending over an arbitrary period. Judging by the results, I'm actually quite frugal on Taobao.

  The script's main job is to simulate a browser login and parse the "purchased items" pages to extract the orders and items in the specified range.

Taobao

  See the code for usage details, or run the command with -h. BeautifulSoup4 is required.
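The headline numbers the script prints reduce to a simple aggregation over the parsed orders. A toy sketch of that arithmetic with made-up orders (the field names mirror the script's dicts; the data is invented):

```python
# hypothetical parsed orders, shaped like the script's order dicts
orders = [
    {'status': 'TRADE_FINISHED', 'amount': 80.0,
     'baobei': [{'is_goods': True, 'price': 50.0, 'quantity': 2}]},
    {'status': 'TRADE_CLOSED', 'amount': 30.0, 'baobei': []},  # cancelled, excluded
]

INVALID = ['CREATE_CLOSED_OF_TAOBAO', 'TRADE_CLOSED']
valid = [o for o in orders if o['status'] not in INVALID]

spent = sum(o['amount'] for o in valid)                      # actual amount paid
original = sum(b['price'] * b['quantity']                    # list-price total
               for o in valid for b in o['baobei'] if b['is_goods'])
saved = (original - spent) / original                        # the "saved (?)" ratio
print(spent, original, '{:.0%}'.format(saved))               # prints: 80.0 100.0 20%
```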

Github Gist

# -*- coding: utf-8 -*-
"""淘宝消费情况统计

使用方法:
    python taobao.py -u USERNAME -p PASSWORD -s START-DATE -e END-DATE --verbose

    所有参数均可选,如:

    python taobao.py -u jinnlynn 
    统计用户jinnlynn所有订单的情况

    python taobao.py -s 2014-12-12 -e 2014-12-12
    统计用户(用户名在命令执行时会要求输入)在2014-12-12当天的订单情况

    python taobao.py --verbose
    统计并输出订单明细

Copyright (c) 2014 JinnLynn <eatfishlin@gmail.com>
Released under the terms of the MIT license.
"""
from __future__ import unicode_literals, print_function, absolute_import, division
import urllib
import urllib2
import urlparse
import cookielib
import re
import sys
import os
import json
import subprocess
import argparse
import platform
from getpass import getpass
from datetime import datetime
from pprint import pprint

try:
    from bs4 import BeautifulSoup
except ImportError:
    sys.exit('BeautifulSoup4 missing.')

__version__ = '1.0.0'
__author__ = 'JinnLynn'
__copyright__ = 'Copyright (c) 2014 JinnLynn'
__license__ = 'The MIT License'

HEADERS = {
    'x-requested-with' : 'XMLHttpRequest',
    'Accept-Language' : 'zh-cn',
    'Accept-Encoding' : 'gzip, deflate',
    'ContentType' : 'application/x-www-form-urlencoded; charset=UTF-8',
    'Cache-Control' : 'no-cache',
    'User-Agent' :'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.38 Safari/537.36',
    'Connection' : 'Keep-Alive'
}

DEFAULT_POST_DATA = {
    'TPL_username' : '', # username
    'TPL_password' : '', # password
    'TPL_checkcode' : '',
    'need_check_code' : 'false',
    'callback' : '0', # non-empty value: server responds with JSON
}

# order states treated as invalid
INVALID_ORDER_STATES = [
    'CREATE_CLOSED_OF_TAOBAO', # cancelled
    'TRADE_CLOSED', # order closed
]

LOGIN_URL = 'https://login.taobao.com/member/login.jhtml'

RAW_IMPUT_ENCODING = 'gbk' if platform.system() == 'Windows' else 'utf-8'

def _request(url, data, method='POST'):
    if data:
        data = urllib.urlencode(data)
    if method == 'GET':
        if data:
            url = '{}?{}'.format(url, data)
        data = None
    # print(url)
    # print(data)
    req = urllib2.Request(url, data, HEADERS)
    return urllib2.urlopen(req)

def stdout_cr(msg=''):
    sys.stdout.write('\r{:10}'.format(' '))
    sys.stdout.write('\r{}'.format(msg))
    sys.stdout.flush()

def get(url, data=None):
    return _request(url, data, method='GET')

def post(url, data=None):
    return _request(url, data, method='POST')

def login_post(data):
    login_data = DEFAULT_POST_DATA
    login_data.update(data)
    res = post(LOGIN_URL, login_data)
    return json.load(res, encoding='gbk')

def login(usr, pwd):
    data = {
        'TPL_username' : usr.encode('utf-8' if platform.system() == 'Windows' else 'GB18030'),
        'TPL_password' : pwd
    }

    # 1. attempt to log in
    ret = login_post(data)
    while not ret.get('state', False):
        code = ret.get('data', {}).get('code', 0)
        if code == 3425 or code == 1000:
            print('INFO: {}'.format(ret.get('message')))
            check_code = checkcode(ret.get('data', {}).get('ccurl'))
            data.update({'TPL_checkcode' : check_code, 'need_check_code' : 'true'})
            ret = login_post(data)
        else:
            sys.exit('ERROR. code: {}, message:{}'.format(code, ret.get('message', '')))

    token = ret.get('data', {}).get('token')
    print('LOGIN SUCCESS. token: {}'.format(token))

    # 2. redirect
    # 2.1 fetch the st value
    res = get('https://passport.alipay.com/mini_apply_st.js', {
        'site' : '0',
        'token' : token,
        'callback' : 'stCallback4'})
    content = res.read()
    st = re.search(r'"st":"(\S*)"( |})', content).group(1)
    # 2.2 redirect
    get('http://login.taobao.com/member/vst.htm',
        {'st' : st, 'TPL_username' : usr.encode('GB18030')})

def checkcode(url):
    filename, _ = urllib.urlretrieve(url)
    if not filename.endswith('.jpg'):
        old_fn = filename
        filename = '{}.jpg'.format(filename)
        os.rename(old_fn, filename)
    if platform.system() == 'Darwin':
        # on macOS, open directly in Preview
        subprocess.call(['open', filename])
    elif platform.system() == 'Windows':
        # on Windows, open with the default associated program
        subprocess.call(filename, shell=True)
    else:
        # other systems: just print the file name
        print('Open this file to read the captcha: {}'.format(filename))
    return raw_input('Enter the captcha: '.encode(RAW_IMPUT_ENCODING))

def parse_bought_list(start_date=None, end_date=None):
    url = 'http://buyer.trade.taobao.com/trade/itemlist/list_bought_items.htm'
    #                freight insurance  value-added services  staged payments (deposit, balance)
    extra_service = ['freight-info', 'service-info', 'stage-item']

    stdout_cr('working... {:.0%}'.format(0))
    # 1. parse the first page
    res = urllib2.urlopen(url)
    soup = BeautifulSoup(res.read().decode('gbk'))
    # 2. extract paging info
    page_jump = soup.find('span', id='J_JumpTo')
    jump_url = page_jump.attrs['data-url']
    url_parts = urlparse.urlparse(jump_url)
    query_data = dict(urlparse.parse_qsl(url_parts.query))
    total_pages = int(query_data['tPage'])

    # parse page by page
    orders = []
    cur_page = 1
    out_date = False
    errors = []
    while True:
        bought_items = soup.find_all('tbody', attrs={'data-orderid' : True})
        # pprint(len(bought_items))
        count = 0
        for item in bought_items:
            count += 1
            # pprint('{}.{}'.format(cur_page, count))
            try:
                info = {}
                # the order's position on the page: page.index
                info['pos'] = '{}.{}'.format(cur_page, count)
                info['orderid'] = item.attrs['data-orderid']
                info['status'] = item.attrs['data-status']
                # shop
                node = item.select('tr.order-hd a.shopname')
                if not node:
                    # no shop node: probably a complimentary lottery order, skip it
                    # print('ignore')
                    continue
                info['shop_name'] = node[0].attrs['title'].strip()
                info['shop_url'] = node[0].attrs['href']
                # deal date
                node = item.select('tr.order-hd span.dealtime')[0]
                info['date'] = datetime.strptime(node.attrs['title'], '%Y-%m-%d %H:%M')

                if end_date and info['date'].toordinal() > end_date.toordinal():
                    continue

                if start_date and info['date'].toordinal() < start_date.toordinal():
                    out_date = True
                    break

                # items ("baobei")
                baobei = []
                node = item.find_all('tr', class_='order-bd')
                # pprint(len(node))
                for n in node:
                    try:
                        bb = {}
                        if [True for ex in extra_service if ex in n.attrs['class']]:
                            # handle extra-service rows (freight insurance etc.)
                            name_node = n.find('td', class_='baobei')
                            # item name
                            bb['name'] = name_node.text.strip()
                            bb['url'] = ''
                            bb['spec'] = ''
                            # item snapshot
                            bb['snapshot'] = ''
                            # item price
                            bb['price'] = 0.0
                            # item quantity
                            bb['quantity'] = 1
                            bb['is_goods'] = False
                            try:
                                bb['url'] = name_node.find('a').attrs['href']
                                bb['price'] = float(n.find('td', class_='price').text)
                            except:
                                pass
                        else:
                            name_node = n.select('p.baobei-name a')
                            # item name and url
                            bb['name'] = name_node[0].text.strip()
                            bb['url'] = name_node[0].attrs['href']
                            # item snapshot
                            bb['snapshot'] = ''
                            if len(name_node) > 1:
                                bb['snapshot'] = name_node[1].attrs['href']
                            # item spec
                            bb['spec'] = n.select('.spec')[0].text.strip()
                            # item price
                            bb['price'] = float(n.find('td', class_='price').attrs['title'])
                            # item quantity
                            bb['quantity'] = int(n.find('td', class_='quantity').attrs['title'])
                            bb['is_goods'] = True
                        baobei.append(bb)
                        # try to read the actual amount paid;
                        # the amount node may span the td cells of several tr rows
                        amount_node = n.select('td.amount em.real-price')
                        if amount_node:
                            info['amount'] = float(amount_node[0].text)
                    except Exception as e:
                        errors.append({
                            'type' : 'baobei',
                            'id' : '{}.{}'.format(cur_page, count),
                            'node' : '{}'.format(n),
                            'error' : '{}'.format(e)
                        })
            except Exception as e:
                errors.append({
                    'type' : 'order',
                    'id' : '{}.{}'.format(cur_page, count),
                    'node' : '{}'.format(item),
                    'error' : '{}'.format(e)
                })

            info['baobei'] = baobei
            orders.append(info)

        stdout_cr('working... {:.0%}'.format(cur_page / total_pages))

        # next page
        cur_page += 1
        if cur_page > total_pages or out_date:
            break
        query_data.update({'pageNum' : cur_page})
        page_url = '{}?{}'.format(url, urllib.urlencode(query_data))
        res = urllib2.urlopen(page_url)
        soup = BeautifulSoup(res.read().decode('gbk'))

    stdout_cr()
    if errors:
        print('INFO: errors occurred while parsing; the results may be inaccurate.')
        # pprint(errors)
    return orders

def output(orders, start_date, end_date):
    amount = 0.0
    org_amount = 0
    baobei_count = 0
    order_count = 0
    invaild_order_count = 0
    for order in orders:
        if order['status'] in INVALID_ORDER_STATES:
            invaild_order_count += 1
            continue
        amount += order['amount']
        order_count += 1
        for baobei in order.get('baobei', []):
            if not baobei['is_goods']:
                continue
            org_amount += baobei['price'] * baobei['quantity']
            baobei_count += baobei['quantity']

    print('{:<18} {}'.format('Total spent:', amount))
    print('{:<18} {}/{}'.format('Orders/items:', order_count, baobei_count))
    if invaild_order_count:
        print('{:<18} {} (returned or cancelled etc., not counted above)'.format('Invalid orders:', invaild_order_count))
    print('{:<18} {}'.format('Items, list total:', org_amount))
    print('{:<18} {:.2f}'.format('Avg. item price:', 0 if baobei_count == 0 else org_amount / baobei_count))
    print('{:<18} {} ({:.2%})'.format('Saved (?):',
                                      org_amount - amount,
                                      0 if org_amount == 0 else (org_amount - amount) / org_amount))
    from_date = start_date if start_date else orders[-1]['date']
    to_date = end_date if end_date else datetime.now()
    print('{:<18} {:%Y-%m-%d} - {:%Y-%m-%d}'.format('Period:', from_date, to_date))
    if not start_date:
        print('{:<18} {:%Y-%m-%d %H:%M}'.format('Spending since:', orders[-1]['date']))

def ouput_orders(orders):
    print('All orders:')
    if not orders:
        print('  --')
        return
    for order in orders:
        print('  {:-^20}'.format('-'))
        print('  * Order: {orderid}  Paid: {amount}  Shop: {shop_name}  Date: {date:%Y-%m-%d %H:%M}'.format(**order))
        for bb in order['baobei']:
            if not bb['is_goods']:
                continue
            print('    - {name}'.format(**bb))
            if bb['spec']:
                print('      {spec}'.format(**bb))
            print('      {price} X {quantity}'.format(**bb))

def main():
    parser = argparse.ArgumentParser(
        prog='python {}'.format(__file__)
    )
    parser.add_argument('-u', '--username', help='Taobao username')
    parser.add_argument('-p', '--password', help='Taobao password')
    parser.add_argument('-s', '--start', help='start date, optional, e.g. 2014-11-11')
    parser.add_argument('-e', '--end', help='end date, optional, e.g. 2014-11-11')
    parser.add_argument('--verbose', action='store_true', default=False, help='print order details')
    parser.add_argument('-v', '--version', action='version',
                        version='v{}'.format(__version__), help='version')
    args = parser.parse_args()

    usr = args.username
    if not usr:
        usr = raw_input('Enter Taobao username: '.encode(RAW_IMPUT_ENCODING))
    usr = usr.decode('utf-8') # handle non-ASCII input
    pwd = args.password
    if not pwd:
        if platform.system() == 'Windows':
            # non-ASCII prompts are problematic on Windows
            pwd = getpass()
        else:
            pwd = getpass('Enter Taobao password: '.encode('utf-8'))

    pwd = pwd.decode('utf-8')

    verbose = args.verbose

    start_date = None
    if args.start:
        try:
            start_date = datetime.strptime(args.start, '%Y-%m-%d')
        except Exception as e:
            sys.exit('ERROR. {}'.format(e))

    end_date = None
    if args.end:
        try:
            end_date = datetime.strptime(args.end, '%Y-%m-%d')
        except Exception as e:
            sys.exit('ERROR. {}'.format(e))

    if start_date and end_date and start_date > end_date:
        sys.exit('ERROR: the end date must not be earlier than the start date')

    cj_file = './{}.tmp'.format(usr)
    cj = cookielib.LWPCookieJar()
    try:
        cj.load(cj_file)
    except:
        pass
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj), urllib2.HTTPHandler)
    urllib2.install_opener(opener)

    login(usr, pwd)

    try:
        cj.save(cj_file)
    except:
        pass

    orders = parse_bought_list(start_date, end_date)
    output(orders, start_date, end_date)

    # print order details
    if verbose:
        ouput_orders(orders)

if __name__ == '__main__':
    main()

Alfred Workflow: Chinese TV Channel Listings

  An Alfred Workflow for quickly checking the program listings of Chinese TV channels. The data comes from CCTV, covers all 140 channels, and channels can be favorited.

Screenshot

Download | Source

  • Type the keyword tv to see what is playing now and next on your favorited channels
  • Covers all 140 channels of the CCTV EPG
  • Channel favorites are supported

Alfred Workflow: App Dig

  AppShopper is a handy site that tracks iOS/Mac app price information. App Dig builds on it to conveniently surface price changes (such as limited-time free apps) right inside Alfred, and it also supports AppShopper Wish Lists.

App Dig Screenshot

Download | Source

  • App icons can be shown in the app list [1]
  • Set your AppShopper username and make your Wish List public to show it [2]

  1. Showing app icons may increase the time needed to fetch the app list; it can be disabled via app setting

  2. Tick 'Share My Wishlist' on your AppShopper Wish List page to make your wish list public
